**Neuromethods 191**

## Eirini Papagiakoumou *Editor*

# All-Optical Methods to Study Neuronal Function

## NEUROMETHODS

Series Editor: Wolfgang Walz, University of Saskatchewan, Saskatoon, SK, Canada

For further volumes: http://www.springer.com/series/7657

Neuromethods publishes cutting-edge methods and protocols in all areas of neuroscience as well as translational neurological and mental research. Each volume in the series offers tested laboratory protocols, step-by-step methods for reproducible lab experiments and addresses methodological controversies and pitfalls in order to aid neuroscientists in experimentation. Neuromethods focuses on traditional and emerging topics with wide-ranging implications to brain function, such as electrophysiology, neuroimaging, behavioral analysis, genomics, neurodegeneration, translational research and clinical trials. Neuromethods provides investigators and trainees with highly useful compendiums of key strategies and approaches for successful research in animal and human brain function including translational "bench to bedside" approaches to mental and neurological diseases.

## All-Optical Methods to Study Neuronal Function

Edited by

## Eirini Papagiakoumou

Institut de la Vision, Sorbonne Université, INSERM, CNRS, Paris, France


ISSN 0893-2336 ISSN 1940-6045 (electronic) Neuromethods ISBN 978-1-0716-2763-1 ISBN 978-1-0716-2764-8 (eBook) https://doi.org/10.1007/978-1-0716-2764-8

© The Editor(s) (if applicable) and The Author(s) 2023

Open Access This book is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence and indicate if changes were made.

The images or other third party material in this book are included in the book's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the book's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

The publisher, the authors, and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This Humana imprint is published by the registered company Springer Science+Business Media, LLC, part of Springer Nature.

The registered company address is: 1 New York Plaza, New York, NY 10004, U.S.A.

## Preface to the Series

Experimental life sciences have two basic foundations: concepts and tools. The Neuromethods series focuses on the tools and techniques unique to the investigation of the nervous system and excitable cells. It will not, however, shortchange the concept side of things as care has been taken to integrate these tools within the context of the concepts and questions under investigation. In this way, the series is unique in that it not only collects protocols but also includes theoretical background information and critiques which led to the methods and their development. Thus, it gives the reader a better understanding of the origin of the techniques and their potential future development. The Neuromethods publishing program strikes a balance between recent and exciting developments like those concerning new animal models of disease, imaging, in vivo methods, and more established techniques, including, for example, immunocytochemistry and electrophysiological technologies. New trainees in neurosciences still need a sound footing in these older methods in order to apply a critical approach to their results.

Under the guidance of its founders, Alan Boulton and Glen Baker, the Neuromethods series has been a success since its first volume published through Humana Press in 1985. The series continues to flourish through many changes over the years. It is now published under the umbrella of Springer Protocols. While methods involving brain research have changed a lot since the series started, the publishing environment and technology have changed even more radically. Neuromethods has the distinct layout and style of the Springer Protocols program, designed specifically for readability and ease of reference in a laboratory setting.

The careful application of methods is potentially the most important step in the process of scientific inquiry. In the past, new methodologies led the way in developing new disciplines in the biological and medical sciences. For example, physiology emerged out of anatomy in the nineteenth century by harnessing new methods based on the newly discovered phenomenon of electricity. Nowadays, the relationships between disciplines and methods are more complex. Methods are now widely shared between disciplines and research areas. New developments in electronic publishing make it possible for scientists that encounter new methods to quickly find sources of information electronically. The design of individual volumes and chapters in this series takes this new access technology into account. Springer Protocols makes it possible to download single protocols separately. In addition, Springer makes its print-on-demand technology available globally. A print copy can therefore be acquired quickly and for a competitive price anywhere in the world.

Saskatoon, SK, Canada Wolfgang Walz

## Preface

Control and monitoring of neuronal activity with light, often called all-optical manipulation of neurons, is arguably the most suitable method for addressing questions regarding communication between neurons within a neural circuit or between different circuits. The large and continuously expanding toolbox of molecular photosensitive probes that activate/inhibit or image (through membrane voltage or calcium changes) neuronal activity, in combination with the development of original light-microscopy methods for stimulating these probes, has contributed tremendously to this direction and led to innovative experimental concepts, where neurons can be manipulated either individually or as ensembles. Indeed, the use of light offers suitable spatiotemporal resolution to manipulate neurons with single-cell specificity. At the same time, it gives access to a large population of cells simultaneously via scanless, parallel illumination methods.

Neural circuit studies are more conclusive for addressing biological questions when performed in vivo. As such, they require three-dimensional (3D) accessibility for both activation and imaging at physiological time scales (activation and imaging on the scale of a few milliseconds). Today, 3D imaging approaches use complementary strategies to access volumes extending up to hundreds of micrometers in the axial direction. In contrast, the development of 3D photoactivation methods is more recent. These systems use computer-generated holography (CGH), a technique based on phase modulation of the excitation beam's wavefront, to create multiple excitation regions of interest. Thanks to 3D-CGH, used either on its own (parallel methods) or in its diffraction-limited version in combination with scanning of the holographic beamlets, it is now possible to activate multiple neurons simultaneously with both adequate temporal resolution and the spatial resolution needed for near single-cell precision. High spatial resolution and selectivity are often ensured by implementing these methods with two-photon excitation.

Although the first all-optical manipulations of neuronal activity activated neurons via uncaging of caged glutamate, the term all-optical today mostly refers to the combination of functional imaging and optogenetic activation. A growing number of studies use optogenetics and calcium imaging to explore hypotheses in cellular and systems neuroscience. Nevertheless, full optical neuronal control remains a challenge in terms of achieving reliable delivery and expression of sensors and actuators in the same neurons, eliminating the crosstalk between imaging and activation, and recording and stimulating with single-neuron and single-action-potential precision.

In this volume, we aim to give an overview of the methods that have been used so far in all-optical experiments, but also to present other promising approaches potentially useful in this domain. The book is addressed to readers with backgrounds in different disciplines, such as physicists, engineers, and neuroscientists; therefore, it starts by providing some basic but fundamental background information in terms of both physiology and optics in the context of all-optical two-photon neurophysiology experiments (Chap. 1), followed by some prompts for the selection of appropriate actuators and sensors, and functional imaging methodologies that drive the choice of both, together with the suitable laser sources for two-photon excitation (Chaps. 1 and 2).

We then present, in detail, optical methods that have been used for photoactivation and imaging. The reader can find the design principles and, in some cases, hardware implementation for methods like generalized phase contrast (Chap. 1), computer-generated holography and scanning approaches (Chaps. 3 and 4), temporal focusing (Chaps. 1 and 4), as well as guidance to the entire workflow for an all-optical experiment in circuit neuroscience (Chap. 5). In Chap. 6, possibilities and limitations of optogenetic actuators are discussed within the context of an all-optical single-beam experiment by giving insights into the photophysical properties of actuators. Detailed methods are provided in Chap. 7 on a miniature head mounted two-photon fiber-coupled microscope for imaging neuronal activity in vivo in freely moving animals, while Chap. 8 discusses how 3D holographic optogenetics can be added to a home-built light sheet microscope.

This book also devotes a part to innovative imaging techniques that could be implemented in the framework of an all-optical electrophysiology experiment. Chapter 9 presents the theory of temporal focusing in combination with single-pixel detection for imaging fast collective biological processes at depth, over a wide field and at high spatiotemporal resolution. Chapter 10 presents alternative imaging methods at synaptic resolution, such as two-photon fluorescence microscopy equipped with Bessel focus scanning technology and widefield fluorescence microscopy with optical sectioning ability.

The implementation of simultaneous two-photon imaging and holographic optogenetics in conjunction with population analytical tools or with psychophysical measurements of evoked synthetic percepts to confirm a precise relationship between optical manipulations and behavior is presented in Chaps. 11 and 12.

Finally, in Chap. 13, an approach for label-free imaging is presented: stimulated Raman scattering (SRS) microscopy, a non-linear imaging method for visualizing a molecule based on its chemical properties, and the way to integrate it in a commercial multiphoton microscope, eventually useful for label-free functional imaging.

The use of all-optical methods for studying neural function is a multiparametric and arduous project to set up. It requires multidisciplinary know-how, both for developing the optical system and for the appropriate biological preparation, and sometimes expensive equipment, especially when multiphoton excitation is considered. We hope that this book can serve as a guide to facilitate the first requirement, establishing a useful reference for groups starting their activity in this domain, giving insights into the optical systems and the choice of actuators and sensors, and stimulating ideas for ground-breaking configurations and experiments.

Institut de la Vision, Sorbonne Université, INSERM, CNRS, Paris, France

Eirini Papagiakoumou

## Contributors


and Integrated Bioimaging Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA

GILAD M. LERMAN • Neuroscience Institute, NYU Langone Health, New York, NY, USA


## Chapter 1

## Optical Manipulation and Recording of Neural Activity with Wavefront Engineering

## Ruth R. Sims, Imane Bendifallah, Kris Blanchard, Dimitrii Tanese, Valentina Emiliani, and Eirini Papagiakoumou

#### Abstract

One of the central goals of neuroscience is to decipher the specific contributions of neural mechanisms to different aspects of sensory perception. Since achieving this goal requires tools capable of precisely perturbing and monitoring neural activity across a multitude of spatiotemporal scales, this aim has inspired the innovation of many optical technologies capable of manipulating and recording neural activity in a minimally invasive manner. The interdisciplinary nature of neurophotonics requires a broad knowledge base in order to successfully develop and apply these technologies, and one of the principal aims of this chapter is to provide some basic but fundamental background information in terms of both physiology and optics in the context of all-optical two-photon neurophysiology experiments. Most of this information is expected to be familiar to readers experienced in either domain, but is presented here with the aim of bridging the divide between disciplines in order to enable physicists and engineers to develop useful optical technologies or for neuroscientists to select appropriate tools and apply them to their maximum potential.

The first section of this chapter is dedicated to a brief overview of some basic principles of neural physiology relevant for controlling and recording neuronal activity using light. Then, the selection of appropriate actuators and sensors for manipulating and monitoring particular neural signals is discussed, with particular attention paid to kinetics and sensitivity. Some considerations for minimizing crosstalk in optical neurophysiology experiments are also introduced. Next, an overview of the state-of-the-art optical technologies is provided, including a description of suitable laser sources for two-photon excitation according to particular experimental requirements. Finally, some detailed technical information regarding the specific wavefront engineering approaches known as Generalized Phase Contrast (GPC) and temporal focusing is provided.

Key words All-optical neurophysiology, Light shaping, Temporal focusing, Generalized phase contrast, Computer-generated holography, Functional imaging, Optogenetics, Molecular tools, GECIs, GEVIs

#### 1 Introduction

Experiments in modern neuroscience require techniques capable of monitoring ("reading") and manipulating ("writing") neural activity across a staggering range of spatiotemporal scales. For instance, ion channels have nanometer dimensions (Fig. 1a) and undergo conformational changes on micro- to millisecond timescales, whereas neuronal circuits in human brains span decimeters (Fig. 1d) and can be refined over the course of a lifetime. Due to the minimally invasive nature of infrared photons in brain tissue, a plethora of optical technologies, based on multiphoton excitation, spanning these spatiotemporal scales have been developed and the toolbox of optical actuators and indicators of neural activity has continuously expanded and evolved. As a result of this rapid multidisciplinary progress, optical activation and inhibition of genetically defined classes of neurons can now be achieved using a variety of light-gated actuators (mainly channelrhodopsins) and neural activity can be detected using highly specific and sensitive fluorescent probes, including calcium and voltage indicators. The field of optogenetics in neuroscience has matured to such an extent that two-photon all-optical experiments can be performed in vivo, whereby signals from multiple neurons constituting neural circuits distributed across different brain regions can be elicited and recorded optically, with single-cell and sub-millisecond precision [2].

Fig. 1 Different spatiotemporal scales encountered in all-optical neurophysiology experiments. Relevant spatial (a–d) and temporal (e–h) scales encountered in all-optical neurophysiology experiments. (a) Ion channels and pumps, with nanometer dimensions, residing within the cell membrane are ultimately responsible for the excitability of individual neurons. (b) All-optical neurophysiology experiments aiming for photoactivation with single-cell resolution target the neuronal soma (~10 μm diameter). (c) Neurons distributed within millimeter volumes that display coordinated activity are termed neural ensembles or engrams. The primary goal of a growing number of all-optical neurophysiology experiments is to manipulate these functionally defined circuits. (d) Neural activity governing a particular behavior is commonly distributed across multiple, often non-contiguous, brain regions which can span mesoscale (mm–cm) distances. (e) Neurons are depolarized by excitatory inputs (EPSPs) and hyperpolarized by inhibitory inputs (IPSPs) on timescales of tens of milliseconds [1]. (f) Larger and longer changes in membrane potential are sometimes observed when neurons receive multiple synaptic inputs. (g) Action potentials (APs) are fired when the somatic membrane potential is depolarized beyond threshold (-55 mV). Action potentials invert the membrane potential on millisecond timescales. (h) Individual neurons display characteristic patterns of AP firing. Many all-optical neurophysiology experiments (i) simultaneously monitor the dynamic pattern of AP firing in different neurons or (ii) record the firing response of particular neurons to external stimuli (S<sub>i</sub>) in trials before replaying and manipulating these physiological activity patterns using photostimulation and inhibition

Eirini Papagiakoumou (ed.), All-Optical Methods to Study Neuronal Function, Neuromethods, vol. 191, https://doi.org/10.1007/978-1-0716-2764-8\_1, © The Author(s) 2023

However, since different opsins and reporters can exhibit vastly different photophysical characteristics, it is necessary to optimize all-optical neurophysiology experiments according to the specific requirements of each biological question. All-optical experiments are challenging, and their success relies on the careful selection and co-expression of an appropriate actuator-sensor combination, a suitable photoactivation approach for precise and efficient excitation of the desired population of neurons, and a sufficiently sensitive imaging method capable of recording neural activity without spurious activation of the opsin expressing cells.

This chapter introduces and reviews some of the most important molecular and optical technologies for manipulating and recording neural activity and highlights critical parameters common to most all-optical neurophysiology experiments. Since many of these technologies are described in greater detail in subsequent chapters of this book, we refer the reader to these chapters and instead provide specific technical details for implementing generalized phase contrast (GPC) and temporal focusing. Finally, a detailed protocol for the preparation of mouse hippocampal organotypic slices (a commonly used biological preparation) expressing both optical actuators and indicators for all-optical interrogation of neuronal circuits is included.

#### 2 State-of-the-art Technologies for All-Optical Neurophysiology

All-optical neurophysiology experiments require appropriate molecular tools such as light-triggered actuators capable of controlling ion fluxes through the cell's membrane and thus the electrical activity of neurons [3–8] and fluorescent probes which provide optical readouts of neural activity [9–14]. A wide variety of molecular tools, exhibiting different photophysical properties, have been discovered and engineered to meet this requirement. This chapter will focus on tools capable of optically manipulating and recording neuronal activity in scattering tissue, which commonly rely on two-photon excitation (2PE) based on the near-simultaneous absorption of two infrared (IR) photons [15–17]. The necessity of exploiting non-linear optical phenomena such as 2PE for performing spatially localized experiments in scattering tissue is well documented [18]. 2PE laser scanning microscopy (2PE-LSM), in which a tightly focused, pulsed laser beam is rapidly displaced using galvanometric mirrors [19], is the gold-standard technique for imaging in turbid biological tissue and has also been applied to manipulate neural activity [20]. However, since in some cases this approach does not provide sufficient temporal resolution, a large number of different methods have been developed for optimized excitation of channelrhodopsins and indicators of neural activity. This section will provide an overview of the photophysical requirements of the molecular tools commonly used in all-optical neurophysiology experiments, before reviewing some state-of-the-art sequential and parallel 2PE approaches and evaluating their suitability with respect to exciting and imaging these actuators and indicators.

Useful technologies for all-optical neurophysiology experiments must be capable of eliciting, suppressing, and recording neural activity on physiologically relevant spatiotemporal scales, highlighted in Fig. 1a–h. In order to describe and relate the photophysical properties of molecular tools to specific physiological benchmarks, some relevant properties of single neurons and neural networks will first be reviewed.

Although the extracellular and cytoplasmic environment of any neuron is electrically neutral, the immediate surroundings of the plasma membrane (an electrical insulator) have very thin clouds of negative and positive ions that are differentially spread on its inner and outer surfaces (Fig. 1a) [21]. At rest, the inner cytoplasmic surface has an excess of negative charge with respect to the extracellular side. This electrical gradient is actively generated and maintained by the action of the sodium–potassium pump and the presence of passive ion channels (Fig. 1a), which are ultimately responsible for cellular excitability. The difference in charge distribution across the membrane gives rise to a difference in electric potential, the membrane potential (V<sub>m</sub>), which for most neurons has a somatic value of around -70 mV (Fig. 1b). During communication via synaptic transmission between connected neurons (Fig. 1c, d), the V<sub>m</sub> of a particular neuron is altered by presynaptic excitatory (depolarizing) and/or inhibitory (hyperpolarizing) inputs (Fig. 1e, f). These perturbations of the resting potential are the so-called post-synaptic potentials (PSPs) and they are processed and integrated by the soma of the cell. If the net sum of multiple input excitatory or inhibitory PSPs, arriving within the membrane time constant, exceeds a threshold value (~ -55 mV), an action potential (spike) is triggered. Action potentials are highly stereotypical electrical signals which re-orient the electric field across the neuronal membrane on millisecond timescales (Fig. 1g). Action potentials typically lead to an elevation of the concentration of cytosolic calcium through voltage-gated calcium channels, which can last an order of magnitude longer than the action potentials themselves. Each spike is communicated to post-synaptic neurons via local and long-range synaptic connections: pre-synaptic neurons release neurotransmitter onto postsynaptic targets, evoking unitary inhibitory or excitatory post-synaptic potentials (uPSPs). Typically, uPSPs are small in amplitude and duration while PSPs resulting from the integration of multiple synaptic inputs have been observed to give rise to longer and larger variations of somatic membrane potential (Fig. 1e, f) [22, 23], and ultimately may result in action potential firing (Fig. 1g). A wide variety of precise patterns and frequencies of action potential firing have been observed, both for individual neurons responding to distinct stimuli and for different types of neurons located in particular brain regions [24] (Fig. 1h). Furthermore, particular patterns of spike firing in many individual neurons have been observed to be closely correlated with changes in the external sensory world [25–27], and observations of highly coordinated patterns of activity in ensembles of different neurons [28, 29] have led to one of the central hypotheses of modern neuroscience: that higher brain function arises from the interactions between interconnected neurons [30] (Fig. 1c, d, h). Elucidating the causal relationship between neural circuits and network function requires methods capable of stimulating and silencing neurons to mimic physiological patterns of network activity. This necessitates the observation and subsequent manipulation of rate and spike timing across an ensemble of neurons with sub-millisecond temporal precision.

#### 2.1 Photophysical Properties of Common Molecular Tools Used for All-Optical Neurophysiology

Following decades of heroic protein engineering efforts, desired populations of neurons in virtually all genetically tractable model organisms can now be engineered to express photosensitive transmembrane proteins known as channelrhodopsins [31]. Channelrhodopsins are ion channels of microbial origin, which can be excited into current-conducting states upon light absorption [32, 33] (Fig. 2a). The first sets of experiments that demonstrated optical control of neuronal activity using channelrhodopsin were based on the heterologous expression of Channelrhodopsin-2 from Chlamydomonas reinhardtii [33, 36, 37]. Since then, a dizzying number of excitatory and inhibitory opsin variants with different mechanistic and operational properties have been discovered and engineered. While the optimal choice of opsin for a given experiment depends on the specific preparation and biological question, usually opsin variants exhibiting large photocurrents, selectivity for relevant ions, high light sensitivity, appropriate channel kinetics, and spectral compatibility are preferred.

Fig. 2 Two-photon characterization of channelrhodopsins. (a) In the simplest conceptual model of the opsin photocycle, ion channels reside in the closed state. Upon light absorption, the channels open, allowing the exchange of ions between the cytosol and extracellular space. Depending on the ion selectivity of the channel and the electrochemical gradient, this flow of ions will hyperpolarize or depolarize the cell membrane and ultimately inhibit or excite the cell. (b) In reality, the opsin photocycle is more complex, but can reasonably be approximated by the so-called four-state model. For more details refer to [34, 35]. (c) Opsins can be characterized using whole-cell voltage patch clamp to measure the currents that flow across the cell membrane under different conditions. Inset: visualization of a characteristic 12-μm diameter holographic spot, typically used for parallel 2P photoactivation. (d) Photocurrent traces recorded in whole-cell voltage patch clamp from CHO (Chinese Hamster Ovary) cells expressing ChRmine as a function of increasing 2P excitation power (920 nm, 12-μm diameter excitation spot, 200 ms continuous illumination, incident powers varied between 0 and 50 mW as indicated in the color bar). The characteristic features of the photocurrent traces (kinetics and peak/stationary photocurrent) are labeled. The magnitude of the photocurrent increases with power density to saturation. (e) 2P-LSM image of AAV9-CaMKIIa-somBiPOLES-mCerulean expressed in hippocampal organotypic slice cultures by bulk infection (scale bar represents 50 μm). (f) Photostimulation (upper, 1100-nm illumination, 0.44 mW/μm<sup>2</sup>, 5 ms continuous illumination (red bar)) and inhibition (lower, 920-nm illumination, 0.3 mW/μm<sup>2</sup>, 200 ms illumination during constant current injection (gray bar)) of a single neuron expressing somBiPOLES with a 12-μm diameter holographic spot

Using light to modulate electrical activity in opsin-expressing neurons generally requires generating photocurrents with sufficient magnitudes, within the membrane time constant, to depolarize or hyperpolarize the cells and evoke or inhibit action potentials, although there are interesting and notable exceptions [38]. The precise magnitude of photocurrent necessary to evoke or inhibit spikes depends on the biophysical properties of the membrane such as input resistance, capacitance, and action potential threshold, which can vary significantly between neurons. Furthermore, the absolute photocurrent magnitude that can be generated in a given neuron itself depends on many factors, including the specific properties of the opsin, the degree of expression, the efficiency of membrane targeting, and the photostimulation modality (for instance, single- or multi-photon excitation). Excitatory or inhibitory effects can be elicited by expressing different sub-classes of opsins with specific ion selectivity (for instance, sodium or protons [39, 40] for excitation and chloride or potassium [41, 42] for inhibition). Since the single-channel conductance of most opsins (~50 fS) is three to four orders of magnitude smaller than that of ion channels endogenously expressed in neurons (~100 pS) [32, 43, 44], optical control of neuronal activity relies on the expression and subsequent excitation of sufficient opsin molecules distributed over an extended region of the cell membrane. This consideration is essential in the case of 2P excitation which is intrinsically spatially confined. Hence, one of the first challenges in optogenetics experiments is achieving reliable, homogeneous, and functional expression of the desired opsin in the membranes of target neurons. As a result of intensive protein engineering efforts, several successful strategies such as codon optimization [45, 46] and membrane trafficking sequence optimization [47] have been developed which enable sufficiently high functional expression levels (~10<sup>5</sup> opsin molecules per neuron) without detectably perturbing membrane physiology [48]. In particular, soma-targeted opsin variants which utilize the C-terminal targeting motif from the soma-localized potassium channel Kv2.1 have exhibited improved membrane localization and enhanced photocurrents [49]. Additionally, variants of inhibitory opsins with similar soma targeting sequences have been demonstrated to result in fewer antidromic spikes [42]. The use of soma-targeted opsins has also been demonstrated to significantly reduce off-target photoactivation [6, 7, 50], which is a crucial consideration for certain applications.
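
To give a feeling for the numbers quoted above, the following back-of-the-envelope estimate may help. It is only an illustrative upper bound, not a measured value: it assumes that all N opsin channels are open at the same time, that the reversal potential E_rev is roughly 0 mV (an assumption appropriate for a non-selective cation channel, not a figure from the text), and that the cell sits at the resting potential of about -70 mV quoted earlier. The symbols N, g, and E_rev are introduced here for illustration only.

```latex
\begin{aligned}
I_{\max} &= N \, g \, \lvert V_m - E_{\mathrm{rev}} \rvert \\
         &\approx 10^{5} \times 50\ \mathrm{fS} \times 70\ \mathrm{mV} \\
         &= 10^{5} \times \left(5 \times 10^{-14}\ \mathrm{S}\right) \times \left(7 \times 10^{-2}\ \mathrm{V}\right) \\
         &= 3.5 \times 10^{-10}\ \mathrm{A} = 350\ \mathrm{pA}
\end{aligned}
```

Under the same assumptions, a single opsin channel carries roughly 50 fS × 70 mV = 3.5 fA, whereas a single endogenous 100 pS channel carries about 7 pA. This is why a large number of opsins, spread over an extended membrane area, must be excited together. Real photocurrents will be smaller than this bound, since only a fraction of the channels is illuminated and open at any instant.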

At physiological membrane potentials, exposing channelrhodopsin-expressing neurons to light of an appropriate wavelength causes the light-gated ion channels to open. This allows the passage of specific ions across the cell membrane (according to their electrochemical gradients) and generates photocurrents (I) that can modulate neuronal activity (Fig. 2a, b). As highlighted previously, enhancing or suppressing neural activity using optogenetics requires the excitation of a sufficient number of opsins within the membrane time constant to induce adequate depolarization (or hyperpolarization) of the soma to cause (or prevent) the opening of voltage-gated ion channels. As a result, it is the macroscopic photocurrent parameters that emerge due to the combined action of functional, membrane-localized, channels that are relevant for all-optical neurophysiology experiments and will be discussed throughout this section.
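
The dependence of the required photocurrent on input resistance and membrane time constant can be sketched with a passive RC membrane model. This is a simplified illustration rather than a description of any real neuron: it treats the photocurrent as a constant current step, ignores all voltage-gated conductances, and uses assumed values for the input resistance (200 MΩ) and membrane time constant (20 ms). Only the 15 mV gap between rest (-70 mV) and threshold (-55 mV) and the 5 ms pulse duration come from the text and figure captions. The function name is hypothetical.

```python
import math


def required_photocurrent_pA(delta_v_mV, r_in_MOhm, tau_m_ms, pulse_ms):
    """Constant current (pA) that charges a passive RC membrane by
    delta_v_mV within pulse_ms, using V(t) = I * R_in * (1 - exp(-t / tau_m))."""
    charged_fraction = 1.0 - math.exp(-pulse_ms / tau_m_ms)
    # mV / MOhm gives nA; multiply by 1000 to express the result in pA.
    return 1000.0 * delta_v_mV / (r_in_MOhm * charged_fraction)


if __name__ == "__main__":
    delta_v = 15.0  # mV, from rest (-70 mV) to threshold (-55 mV)
    r_in = 200.0    # MOhm, assumed input resistance
    tau_m = 20.0    # ms, assumed membrane time constant
    for pulse in (1.0, 5.0, 20.0, 200.0):  # ms, illustrative pulse durations
        current = required_photocurrent_pA(delta_v, r_in, tau_m, pulse)
        print(f"pulse {pulse:6.1f} ms -> about {current:7.1f} pA")
```

In this sketch, shorter pulses demand larger currents because the membrane has less time to charge, and a lower input resistance raises the required current proportionally. This is one way to see why the same illumination can drive one neuron to threshold and not another.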

It is possible to quantify and characterize these macroscopic photocurrent parameters using electrophysiology, specifically whole-cell voltage clamp (Fig. 2c). For example, in response to continuous illumination, the photocurrent of a channelrhodopsin-expressing neuron exhibits a characteristic profile with three main features: (i) an initial peak (Ip), which decays to reach (ii) a steady-state (stationary) plateau (Is), and finally (iii) a return to baseline in the absence of light. Representative photocurrent traces are plotted for increasing illumination power in Fig. 2d with these main features highlighted. The transitions between these features of the macroscopic photocurrent are commonly parametrized by the time constants τon, τin, and τoff, which describe, respectively, the time it takes for the photocurrent to reach the peak when the channels open, the time for inactivation, and the time to reach zero when the channels close. These time constants typically exhibit millisecond values [47] but vary between channelrhodopsins and can also depend on the intrinsic membrane properties of the cell. The functional profile of this macroscopic photocurrent has extremely important implications for 2P optogenetics experiments since it ultimately dictates the optimum illumination strategy and imposes bounds on temporal resolution, temporal precision (jitter), and spiking rate [48]. Different characteristics of the macroscopic photocurrent are relevant for different paradigms of optogenetic photostimulation. For instance, opsins with fast "off" kinetics are critical for applications which necessitate the induction of spike trains with high temporal fidelity (e.g., sub-millisecond precision) and high firing frequencies [49].
Inducing cell depolarization at faster rates than the kinetics permit can cause prolonged depolarization to the so-called plateau potential, induced by excessive cation influx, which can induce non-uniformity in neuron responses to identical light pulses and, in some cases, cessation of action potential firing [51]. However, it is important to note that opsins with faster τoff kinetics generally require higher light intensities to reach action potential threshold, which might be an important consideration in experiments aiming to simultaneously photostimulate large numbers of neurons [52] where the power for photoactivation must be divided between targets. On the other hand, to reliably inhibit action potential firing during a prolonged interval, the macroscopic photocurrent must exhibit a high steady-state to peak (Is/Ip) ratio and high conductivity of anions throughout the entire photocycle. Influencing neural activity over extremely long periods of time without causing photodamage, for instance to sensitize entire neuronal networks to native activity patterns, benefits from the use of a class of opsins with exceptionally slow kinetics known as step function opsins (SFOs). SFOs can be photoactivated using a single, low intensity light pulse, remain in the "open" state for extended timescales (minutes) and can often be closed using a second pulse of light at a different wavelength [53]. In conclusion, the photocycle kinetics, sensitivity, selectivity, and photocurrent magnitude are commonly the primary considerations when selecting an appropriate opsin for a particular all-optical neurophysiology experiment. Having selected and successfully expressed the channelrhodopsin, the intensity and duration of delivered light must be titrated until the desired neuronal response is reliably elicited.
To achieve inhibition and excitation of the same neurons during the same experiment, with different excitation wavelengths, bicistronic constructs such as BiPOLES [54] can be used (Fig. 2e, f). Such constructs are constituted of excitatory and inhibitory channelrhodopsins expressed in tandem for precise stoichiometry.
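
The photocurrent profile described above (fast rise, decay from an initial peak to a steady-state plateau, and return to baseline after light offset) can be illustrated with a simple phenomenological sketch. The functional form, the function name `photocurrent`, and all parameter values below are illustrative assumptions chosen for clarity; they are not a fitted model of any particular opsin.

```python
import math


def photocurrent(t_ms, light_on_ms, i_peak, i_steady, tau_on, tau_in, tau_off):
    """Phenomenological photocurrent (arbitrary units) at time t_ms.

    Light is assumed to be on from t = 0 to t = light_on_ms.
    i_peak is the asymptotic transient amplitude; the measured peak is
    slightly lower because the current is still rising while it inactivates.
    """
    def during_light(t):
        rise = 1.0 - math.exp(-t / tau_on)  # channels opening (tau_on)
        # Inactivation from the transient level toward the plateau (tau_in).
        level = i_steady + (i_peak - i_steady) * math.exp(-t / tau_in)
        return rise * level

    if t_ms < 0:
        return 0.0
    if t_ms <= light_on_ms:
        return during_light(t_ms)
    # After light offset: mono-exponential return to baseline (tau_off).
    return during_light(light_on_ms) * math.exp(-(t_ms - light_on_ms) / tau_off)


# Hypothetical parameters: times in ms, currents in arbitrary units.
params = dict(light_on_ms=200.0, i_peak=1.0, i_steady=0.4,
              tau_on=1.0, tau_in=20.0, tau_off=10.0)
times = [0.1 * k for k in range(4001)]  # 0 to 400 ms in 0.1 ms steps
trace = [photocurrent(t, **params) for t in times]

measured_peak = max(trace)
plateau = photocurrent(199.0, **params)  # shortly before light offset
print(f"peak = {measured_peak:.2f}, plateau = {plateau:.2f}, "
      f"Is/Ip = {plateau / measured_peak:.2f}")
```

In this sketch, raising `i_steady` toward `i_peak` gives the high Is/Ip ratio that sustained inhibition calls for, whereas shortening `tau_off` mimics the fast "off" kinetics needed for high-frequency spike trains.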

In all-optical neurophysiology experiments, photostimulation is performed alongside functional imaging, both in order to identify the specific set of neurons to target according to their activity patterns (in response to a particular stimulus) and also to observe how the induced patterns of neural activity affect cellular or network function. Fluorescent reporters sensitive to changes in many different aspects of neuron physiology have been developed, but those responsive to action potentials, such as calcium and voltage indicators, are the most widely used.

Fig. 3 Calcium and voltage indicators as reporters of neuronal activity. (a) Cytosolic calcium concentration increases temporarily as a result of the change in membrane potential that occurs during an action potential. The intensity-based fluorescent probe GCaMP binds to calcium. This alters the conformation of the circularly permuted GFP chromophore and results in an increase in fluorescence intensity. (b) Left: 2P-LSM image of AAV9-Syn-jGCaMP7s expressed in hippocampal organotypic slice cultures by bulk viral infection (scale bar represents 25 μm). Right: fluorescent jGCaMP7s traces in response to trains of action potentials (5, 15, 40 Hz) evoked by pulsed current injection into a patched neuron (indicated below). (c) In the case of voltage-sensing domain (VSD)-based voltage indicators, a change in membrane potential causes a change in conformation of the VSD, which is covalently linked to a circularly permuted fluorophore. The change in conformation of the fluorophore typically results in a decrease in fluorescence intensity. (d) Left: 2P-LSM image of AAV8-hSyn-ASAP3b expressed in hippocampal organotypic slice cultures by bulk viral infection (scale bar represents 25 μm). Right: simulated ASAP3b traces in response to trains of action potentials (5, 15, 40 Hz) as in (b)

Calcium imaging using fluorescent protein sensors has proved particularly useful for all-optical neurophysiology since the activity of large numbers of neurons can be recorded simultaneously [55–60]; the same group of neurons can be imaged during extended time periods and can also be compared across different recording sessions. The allure of calcium imaging stems, in part, from the photophysical properties of the optical signal. In mammalian neurons, spiking activity results in a temporary increase of Ca2+ concentration throughout the soma via voltage-gated Ca2+ channels, which open as a result of the change in membrane potential during the action potential (Fig. 3a). This somatic Ca2+ influx may also be amplified by calcium release from intracellular stores [61, 62]. As such, a vast number of freely diffusing calcium indicators distributed throughout the cytosolic volume can collectively report on the occurrence of action potentials. Although action potentials only last a few milliseconds, the calcium elevation lasts approximately 2 orders of magnitude longer, resulting in a bright, slowly decaying fluorescent signal that can readily be detected with high signal-to-noise ratio (SNR) as illustrated in Fig. 3b. The GFP-based GCaMP family of genetically encoded calcium indicators (GECIs) is used most commonly in all-optical neurophysiology. Multiple rounds of mutagenesis have yielded the latest suite of variants (jGCaMP8) which exhibit different properties optimized for particular applications [63]. While calcium imaging is the most commonly used approach for imaging the activity of large neural populations, the potential pitfalls associated with using a second messenger that exhibits slow kinetics are also widely acknowledged and must be considered [64, 65].
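
The trade-off between a bright, long-lasting signal and slow kinetics can be sketched with a toy model in which each action potential adds an instantaneously rising, exponentially decaying transient. Linear summation, the absence of indicator saturation, and the decay constant of 0.5 s are simplifying assumptions made purely for illustration; the function names are hypothetical. The 5, 15, and 40 Hz rates mirror the spike trains of Fig. 3b.

```python
import math


def spike_train(rate_hz, duration_s):
    """Regularly spaced spike times (s), starting at t = 0."""
    n_spikes = int(round(rate_hz * duration_s))
    return [k / rate_hz for k in range(n_spikes)]


def calcium_trace(spike_times_s, t_s, tau_decay_s=0.5, amplitude=1.0):
    """Toy fluorescence signal (a.u.) at time t_s.

    Each spike contributes amplitude * exp(-(t - t_spike) / tau_decay).
    Linear summation without saturation is an assumption of this sketch.
    """
    return sum(amplitude * math.exp(-(t_s - ts) / tau_decay_s)
               for ts in spike_times_s if ts <= t_s)


peaks = {}
for rate in (5, 15, 40):
    spikes = spike_train(rate, duration_s=1.0)
    samples = [calcium_trace(spikes, 0.001 * k) for k in range(3001)]  # 0 to 3 s
    peaks[rate] = max(samples)
    print(f"{rate:>2} Hz train: peak signal = {peaks[rate]:.1f} a.u.")
```

Because the assumed decay constant far exceeds the inter-spike interval at 15 and 40 Hz, the transients overlap and the trace reports spike count far more readily than spike timing, which is one of the pitfalls noted above.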

Voltage indicators generate optical signals with magnitudes proportional to changes in membrane potential (Fig. 3c) and can be used to provide a readout of precise action potential timing in addition to sub-threshold depolarizations and hyperpolarizations. At present, genetically encoded voltage indicators (GEVIs) may broadly be divided into three categories: rhodopsin-based indicators [66, 67], hybrid chemogenetic indicators [13], and sensors based on the fusion of a fluorophore to a voltage-sensing domain (VSD) [68, 69], though only the latter category of GEVIs has been demonstrated to be compatible with 2P excitation [70]. Calcium imaging is a much more prevalent technique than voltage imaging, since optically monitoring changes in membrane potential is fundamentally more challenging in terms of signal detection. Firstly, only voltage-sensitive reporters located within a Debye length can report on the membrane potential, and improperly localized GEVIs reduce the sensitivity of optical measurements of membrane potential by increasing background fluorescence. As for channelrhodopsins, it has been demonstrated that fusing GEVIs with soma localization motifs improves membrane trafficking and reduces off-target intracellular labeling. While a typical neural soma constitutes around 60% of the entire cell volume, the somatic membrane only accounts for 2–7% of the total cell surface area [71, 72]. As a result, the number of voltage indicators that can report on the membrane potential is less than 0.1% of the number of Ca2+ indicators in the cytosol [73, 74], which places an upper bound on the signal-to-noise ratio of voltage imaging (Fig. 3d) [75]. This is compounded by the fact that action potentials occur on much shorter timescales than the consequent calcium signal, and hence voltage imaging requires much faster sampling rates (>500 Hz and in many cases >1 kHz, depending on the specific application).
Raster scanning is an inefficient approach for detecting membrane-localized signals which account for a small fraction of the field of view (FOV) – and the resulting frame rates are insufficient for population-level voltage imaging. The unifying feature of different approaches optimized for 2P voltage imaging is an increased illumination duty cycle of signal-generating pixels. Such increases in temporal resolution are often achieved at the cost of increased photobleaching, which is compounded by the fact that voltage indicators are replenished slower than calcium indicators because diffusion is much slower in the membrane lipid bilayer than in the cytoplasm, necessitating the use of more robust fluorophores [76]. Additionally, sample motion is more problematic for voltage imaging. Though population voltage imaging is technically more challenging than calcium imaging, it has the potential to provide more physiologically relevant information about the logic and syntax of the neural code and, indeed, is necessary for a subset of all-optical neurophysiology experiments.
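
To give a sense of scale for the signal-to-noise argument above, the following sketch estimates the relative SNR of voltage versus calcium imaging under shot-noise-limited detection, where SNR scales with the square root of the number of detected photons. It assumes equal brightness per indicator molecule, equal fractional fluorescence change, and equal excitation conditions, none of which holds exactly in practice; the function name and the 30 Hz calcium frame rate are hypothetical. Only the 0.1% molecule fraction and the 1 kHz sampling rate are taken from the text.

```python
import math


def relative_shot_noise_snr(molecule_fraction, sampling_rate_ratio=1.0):
    """SNR of voltage imaging relative to calcium imaging (shot-noise limit).

    molecule_fraction:   reporting voltage indicators / calcium indicators
    sampling_rate_ratio: voltage sampling rate / calcium sampling rate

    Photons per sample are assumed proportional to the number of reporting
    molecules and to the integration time (the inverse of the sampling rate).
    """
    if molecule_fraction <= 0 or sampling_rate_ratio <= 0:
        raise ValueError("inputs must be positive")
    photon_ratio = molecule_fraction / sampling_rate_ratio
    return math.sqrt(photon_ratio)


# Molecule fraction of 0.1% (upper bound quoted in the text), same sampling rate.
same_rate = relative_shot_noise_snr(0.001)

# Same fraction, sampling voltage at 1 kHz versus calcium at a hypothetical 30 Hz.
faster_sampling = relative_shot_noise_snr(0.001, sampling_rate_ratio=1000.0 / 30.0)

print(f"relative SNR, equal sampling rate: {same_rate:.3f}")
print(f"relative SNR, 1 kHz vs 30 Hz:      {faster_sampling:.4f}")
```

Under these assumptions the per-sample SNR of voltage imaging is a few percent of that of calcium imaging, or lower once the faster sampling rate is included, which is why increasing the illumination duty cycle of signal-generating pixels is so valuable.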

Compatible actuators and indicators must be carefully selected in order to simultaneously and independently monitor and control neural activity in a single preparation. Firstly, the fluorophore used to aid visualization of opsin-positive cells should generally be spectrally separate from both the opsin and the activity reporter and should be chosen so as not to occupy precious spectral bandwidths. This is a particularly important consideration in the case of voltage imaging, where any bleed-through, activity-independent fluorescence degrades precious signal-to-noise ratio and ultimately reduces the detectability of neuronal signals. Most crucially, all-optical experiments generally benefit from employing spectrally orthogonal opsins and activity reporters. Spurious activation of opsin-positive neurons while imaging neural activity can perturb neural networks by altering excitability and inducing changes in synaptic release and plasticity [77]. Imaging artifacts can also be induced due to the excitation of activity reporters during opsin photoactivation, though this is typically less severe since network function is not affected and, ordinarily, these artifacts can be minimized by precisely de-synchronizing photostimulation and imaging (possible at low frame rates such as those used for calcium imaging) or removed during subsequent analysis. Hence, the term "optical crosstalk" is commonly used to describe artefactual photostimulation induced during imaging in all-optical neurophysiology experiments (for a much more detailed discussion regarding crosstalk during all-optical neurophysiology experiments refer to Chaps. 2, 4 and 5).

2.2 Combining Molecular Tools for All-Optical Neurophysiology Experiments

Although channelrhodopsin variants with peak single-photon (1P) excitation wavelengths spanning the visible region of the electromagnetic spectrum have been engineered [39, 78], performing crosstalk-free, multi-color experiments is not trivial. Evidently, variants of actuators and reporters from opposing ends of the spectral palette should be chosen. Unfortunately, the action spectra of channelrhodopsins commonly used for 2P optogenetics are typically extremely broad [39]. Furthermore, so-called red-shifted opsins exhibit persistent "blue tails", which coincide with wavelengths used for 2P imaging of activity reporters (920–950 nm). A number of different approaches aiming to alleviate this problem have been proposed (see also Chap. 2). Very recently, implementation of spectrally independent excitation beams enabled artifact-free all-optical experiments with GCaMP and red-shifted channelrhodopsins (see also Chap. 4) [79]. Parallel excitation methods have taken advantage of the different sub-cellular distributions of GECIs and opsins [80], though of course this is less applicable in the case of voltage imaging (where both indicator and actuator are membrane localized), and further is not intrinsically robust to sample motion, which is problematic for in vivo applications. An alternative approach is to employ blue-shifted opsins in combination with red-shifted reporters [50, 81]. One benefit of this is that longer wavelength fluorescent photons exhibit longer scattering lengths in biological tissue, which should facilitate deeper imaging. While this approach has found success for 1P excitation [67], the two-photon counterpart of this approach has thus far been limited. On the one hand, genetically encoded, red-shifted activity indicators display lower 2P efficacies than green ones and, furthermore, amplified lasers in the spectral region adequate for photostimulating several cells expressing blue-shifted opsins (920–950 nm) have only recently become available [81]. Another approach to minimize crosstalk is to use opsins with fast kinetics and optimize the raster-scanned trajectory used to image GCaMP activity to minimize the accumulation of photocurrent during the membrane time constant. Although this method does not eliminate sub-threshold network perturbation, the (relatively) fast repolarization of neurons expressing opsins with short τoff values means they are unlikely to fire due to depolarization induced by the scanned imaging beam.
Of course, successful employment of this method requires careful titration of different imaging conditions, including imaging power, frame rate, and field of view as an interim approach until high efficacy blue-shifted opsins, red-shifted activity indicators [82], and amplified lasers in the appropriate spectral range are developed.

A final subtle point to note when combining actuators and indicators in all-optical neurophysiology experiments is that sustained opsin activation can alter the conditions of the intra- and extracellular environment [83], which could impact the behavior of the opsin, the excitability of the neuron, and also the fluorescent yield of the activity reporter [84], while long-term effects such as changing chloride concentration could influence the entire network. Each of these factors should be considered when drawing conclusions about neural activity based on fluctuations in the fluorescent signal.

2.2.1 Expressing Molecular Tools in Specific Populations of Neurons for All-Optical Neurophysiology Experiments

To perform all-optical neurophysiology experiments, neurons must be genetically modified in order to induce the expression of actuators and indicators in specific populations of neurons, typically via promoter-operating expression specificity. Examples of ubiquitous promoters that can be used to drive expression of actuators and indicators in a broad set of neurons and that are strongly and persistently active in a wide range of cells are the hSyn (human synapsin) promoter and the synthetic mammalian-specific promoter CAG. A variety of approaches exist for gene delivery based on the molecular signatures, projection patterns, anatomical organization, and functional activity of neurons [85]. Viral approaches, electroporation, and constitutive expression in transgenic animals have all been utilized. The most commonly used strategy to date is viral transduction. Viral vectors can be delivered directly to specific brain regions using stereotaxic, intracranial injections, yielding long-term expression and high transgene levels which is especially important in the case of promoters with low transcriptional activity [86]. The degree of viral spread (and hence transgene expression) from the injection site varies with both virus serotype and tissue type [87]. In general, for rodent brains, opsin gene expression reaches functional levels within 3 weeks after adeno-associated virus (AAV) injection. Another approach, single-cell electroporation, provides a much greater degree of control of protein expression patterns than viral transduction and can be used to deliver longer segments of DNA. Using electroporation, an exact set of neurons can be transfected with precise amounts of a single plasmid or with mixtures of plasmids with well-defined ratios [88]. Alternatively, specific cortical layers can be targeted with in utero electroporation [89]. Transgenic animals are also invaluable for all-optical experiments but can be expensive and time-consuming to generate. 
Before establishing transgenic lines, it is important to test, characterize, and calibrate appropriate optogenetic actuators and reporters. In vitro dissociated cell cultures represent an important tool for characterizing actuators and indicators in single homogeneous cell populations. However, because the brain's architecture is lost in the culture process, they are not suitable for studying brain function [90]. Organotypic cultures are becoming a favored preparation for testing new preparations for all-optical neurophysiology experiments (such as new actuator/indicator combinations), since the main network architecture is maintained (Sect. 3.4; Fig. 9d), and it is possible to test many different conditions per animal (10–15 in the case of hippocampal organotypic cultures). A protocol used to produce hippocampal organotypic cultures and perform bulk viral infection is presented in Sect. 3.4 of this chapter. In Fig. 4 we show an all-optical experiment in mouse hippocampal organotypic slices, co-expressing the soma-targeted cation channelrhodopsin ST-ChroME and the genetically encoded Ca2+ indicator GCaMP7s (Fig. 4a). Neurons were photostimulated using two-photon

Fig. 4 All-optical electrophysiology in mouse hippocampal organotypic slices. (a) Two-photon fluorescence image showing the co-expression of the high performance and soma-targeted cation channelrhodopsin ST-ChroME, here tagged to the chromophore mRuby3 (red colour corresponds to the nuclear localization of mRuby3 reporter) and the genetically encoded Ca2+ indicator GCaMP7s (green) in the CA3 region of a hippocampal organotypic slice. White circles represent the two-photon temporally focused spots delivered to excite 50 different neurons (12 μm spot diameter, 1040 nm wavelength, 0.26 mW/μm² incident power). Scale bar: 50 μm. (b) Two-photon imaging of GCaMP7s fluorescence signals evoked by the sequential stimulation of the cells (interstimulus interval ~3 s). Gray bars represent the stimulation protocol which consisted of a train of 5 pulses of 5-ms duration at 4 Hz. The identity of the cells during the sequential stimulation is denoted by the blue numbers on top. In this experiment, 28 out of 50 cells yielded calcium transients in response to stimulation (green horizontal arrowheads). During the acquisition time (~160 s) two synchronous network-wide bursting events were observed (vertical arrowheads at the bottom), the first one seemed to be triggered by the direct activation of a hub-like cell (cell number 15 in the sequence; see pink inset), while the second network-wide event seemed to be triggered by the spontaneous activation of a hub-like cell in the circuit. Pink and orange arrowheads denote the evoked or spontaneous nature of the events, respectively. A single event (in only 1 neuron) with similar characteristics to the network-wide bursting events in terms of amplitude and kinetics was observed near the end of the acquisition time (horizontal orange arrowhead).
The large amplitude of these events reflects the large number and/or frequency of action potential firing in comparison to the fine-tuned control of firing activity evoked by single-cell resolution and sub-millisecond precision patterned photostimulation as it is observed in the inset in (c)

excitation with temporally focused 12-μm diameter holographic spots, and their responses were detected by imaging GCaMP using 2P scanning imaging on a standard galvanometric-based setup. 28 of 50 cells yielded calcium transients in response to photostimulation (Fig. 4b, green horizontal arrowheads). During the experiment (~160 s) two synchronous network-wide bursting events were observed (Fig. 4b, vertical arrowheads at the bottom), the first triggered by the direct activation of a hub-like cell (Fig. 4b, cell 15; pink arrowheads and inset) and the second a possible spontaneous event (Fig. 4b, orange arrowhead at the bottom). These events are typically seen in developing hippocampal networks [91], and demonstrate that network function is maintained in organotypic slices. Moreover, the large amplitude of these events reflects the large number and/or frequency of action potential firing in comparison to the fine-tuned control of action potentials evoked by single-cell resolution and sub-millisecond precision of patterned photostimulation, as evidenced by the inset shown in Fig. 4c.

An extraordinary number of different 2PE technologies have been developed to precisely control neuronal activity using microbial channelrhodopsins and to provide high-fidelity readouts of activity with calcium and voltage indicators. In this chapter, these methods will be broadly categorized as either sequential or parallel methods. While in sequential-2PE a tightly focused beam visits distinct voxels consecutively, parallel-2PE encompasses all methods in which 2PE occurs within a region larger than the diffraction-limited volume.

A wide variety of components capable of rapidly varying the three-dimensional position of a tightly focused beam throughout a volume of interest have been incorporated into 2PE-LSM instruments to increase the temporal resolution of sequential, point-scanned 2PE. This includes devices such as resonant galvanometric mirrors [92], rotating polygon mirrors [93, 94], acousto-optic deflectors (AOD) [95–99], deformable mirrors [100], spatial light modulators (SLMs) [101–104], piezoelectric scanners [105], microelectromechanical systems (MEMS) scanners [106], electrically-tunable lenses (ETL) [107], voice-coils [55, 108], and tunable acoustic gradient (TAG) lenses [109]. Other interesting approaches specifically designed to improve volumetric imaging rates rely on the conversion of lateral beam deflections, typically using galvanometric mirrors, into axial displacements at kilohertz rates [110, 111]. Furthermore, in general, the temporal resolution of sequentially scanned-2PE approaches can be improved by optimizing the scan trajectory according to a pre-defined region of interest (ROI) (e.g., Lissajous scanning [105]).

2.3 State-of-the-art Two-Photon Excitation Approaches for All-Optical Neurophysiology

While successful single-cell optogenetic activation using scanning-2PE based on galvanometric mirrors has been demonstrated [3, 112, 113], photostimulation based on pure sequential scanning is incompatible with channelrhodopsins with fast kinetics since a large portion of the somatic membrane of each neuron must be scanned before the channels begin to close in order to integrate sufficient photocurrent and successfully reach the threshold for action potential firing. Purely sequential raster-scanned 2PE approaches are not capable of high-fidelity, coincident excitation of multiple neurons [102, 114, 115]. Similarly, sequentially scanned-2PE methods have only demonstrated sufficient temporal resolution for voltage imaging by extreme reductions of the field of view to a single line [116] or point [117].

The acquisition rates of scanning-2PE systems can be increased by random-access approaches. These techniques use multiple AODs to rapidly deflect a tightly focused beam to a set of pre-defined three-dimensional locations [98, 118]. Random-access scanning has been successfully applied to both calcium and voltage imaging of up to 20 distinct three-dimensional positions at kilohertz sampling rates [14, 119, 120]. In principle, it is possible to achieve denser spatial sampling than has been demonstrated by random-access scanning; the fundamental limit for unambiguous signal assignment in fluorescence microscopy is the fluorescence lifetime (~ns) [121]. Spatiotemporal multiplexing methods aiming to approach this upper bound have been successfully applied to ultrafast recording of neural activity with calcium and voltage indicators [122, 123]. Furthermore, since the lifetime of common fluorophores is shorter than the pulse separation of common mode-locked lasers used for 2PE, single pulses can be divided into multiple beamlets (diffraction-limited spots), each of which can be laterally or axially displaced to illuminate distinct sample regions at different (although, in some cases, almost simultaneous) times. Fluorescence signals sequentially excited by different beamlets can be de-multiplexed by accurate synchronization of the beam displacement approach with the detector using high-speed electronics [60, 108, 124]. Neglecting scattering, spatiotemporally multiplexed fluorescence from different locations can be unambiguously assigned to its origin provided that the effective dwell-time is longer than the excited state lifetime of the fluorophore.
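
The temporal budget available for spatiotemporal multiplexing can be estimated with a short calculation. The sketch below assumes that consecutive beamlets must be separated in time by at least one fluorescence lifetime (or a chosen multiple of it) and that all beamlets must fit within one laser pulse period. The function name, the 80 MHz repetition rate, and the 3 ns lifetime are illustrative assumptions rather than values taken from the studies cited above.

```python
def max_multiplexed_beamlets(rep_rate_hz, lifetime_s, spacing_in_lifetimes=1.0):
    """Upper bound on beamlets that fit in one laser pulse period.

    Consecutive beamlets are assumed to be separated by at least
    spacing_in_lifetimes * lifetime_s so that their fluorescence can be
    assigned unambiguously (scattering is neglected).
    """
    pulse_period_s = 1.0 / rep_rate_hz
    min_spacing_s = spacing_in_lifetimes * lifetime_s
    return max(1, int(pulse_period_s // min_spacing_s))


def beamlet_delays(rep_rate_hz, n_beamlets):
    """Evenly spaced beamlet delays (s) within one pulse period."""
    pulse_period_s = 1.0 / rep_rate_hz
    return [k * pulse_period_s / n_beamlets for k in range(n_beamlets)]


# Hypothetical example: 80 MHz oscillator, 3 ns fluorescence lifetime.
n_max = max_multiplexed_beamlets(rep_rate_hz=80e6, lifetime_s=3e-9)
delays_ns = [d * 1e9 for d in beamlet_delays(80e6, n_max)]
print(f"up to {n_max} beamlets, delays (ns): "
      + ", ".join(f"{d:.2f}" for d in delays_ns))
```

Lowering the repetition rate lengthens the pulse period and therefore admits more beamlets per pulse, at the cost of fewer excitation pulses per unit time for each sample location.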

An alternative approach to increase temporal resolution is to modulate the electromagnetic field and increase the instantaneous volume of excitation using so-called parallel methods. Since the inception of laser scanning microscopy, efforts have been made to increase the extent of the excitation beam and hence reduce the dimensionality of the raster scan required to fully sample the region of interest. For instance, voltage imaging at rates of 15 kHz has been demonstrated by rapidly scanning holographically generated foci using AODs to simultaneously excite large membrane areas [14]. More common variants of this approach, such as line scanning, increase the excitation extent in a single direction and capture two-dimensional images by scanning in the transverse direction [125]. Widefield temporal focusing takes this concept to its theoretical limit by performing line scanning at the speed of light [126, 127]. Line-scanned tomography has also been used to achieve millisecond-resolved recordings of voltage and calcium indicators [128]. The dimensionality of the excitation beam has also been increased axially to form Bessel and Airy beams [129–132] for volumetric imaging based on lateral scanning (see also Chap. 10). In many cases, elongated foci result in the projection of axial information onto a two-dimensional recording, which can limit their applicability to sparsely labeled samples. This can be overcome by spatial multiplexing to record stereoscopic information [133].

Moving from one-dimensional scanning approaches toward scanless configurations, one class of parallel-2PE approaches uses phase modulation to spatially multiplex the excitation beam and simultaneously project multiple foci in three dimensions to spatially separated sample regions. For instance, spatial light modulators (SLMs) have been used to deflect beamlets to different three-dimensional sample positions through Computer-Generated Holography (CGH) [134, 135] and perform both photostimulation and imaging [80, 104, 136, 137]. The number of beamlets and their position can be dynamically updated up to the SLM refresh rate (~420 Hz for the latest SLM models). Recent innovations such as the combination of overdrive with phase reduction [138], or the sequential illumination of two SLMs [8], have achieved refresh rates in the kHz range. SLM-based spatially multiplexed calcium imaging has been combined with both single pixel [80] and camera detection [139]. Furthermore, calcium imaging at 1 kHz acquisition rates has been demonstrated by using a microlens array rather than an SLM to generate a grid of beamlets [140]. A common approach for 2P photostimulation combines SLM-based multiplexing with a pair of galvanometric mirrors which laterally sweep each focus in a spiral motion spanning the average soma diameter [8, 102, 103, 115, 141–144]. This method can simultaneously excite large ensembles of neurons without compromising temporal resolution with respect to the single-cell spiral scanning case (see also Chap. 3). As for purely sequential-2PE, the temporal resolution of these hybrid parallel-sequential methods can be improved by upgrading the component responsible for sequential scanning.
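
The hybrid spiral-scanning scheme can be pictured as a single galvanometer trajectory shared by every holographic focus. The sketch below generates an Archimedean spiral spanning an assumed soma diameter and replicates it at each target position, mimicking the way the SLM sets the target offsets while the galvanometric mirrors sweep all foci together. The function names, the 10 μm diameter, the number of turns, and the target coordinates are hypothetical values chosen for illustration.

```python
import math


def spiral_scan(diameter_um, n_turns, n_points):
    """Archimedean spiral (x, y) in micrometers, from the center outward."""
    if n_points < 2:
        raise ValueError("n_points must be at least 2")
    r_max = diameter_um / 2.0
    points = []
    for k in range(n_points):
        frac = k / (n_points - 1)            # 0 at the center, 1 at the edge
        radius = r_max * frac                # radius grows linearly with angle
        theta = 2.0 * math.pi * n_turns * frac
        points.append((radius * math.cos(theta), radius * math.sin(theta)))
    return points


def multiplex_spiral(spiral, targets_um):
    """Replicate one shared spiral at every SLM-defined target position."""
    return [[(x + tx, y + ty) for (x, y) in spiral] for (tx, ty) in targets_um]


# Hypothetical example: 10 um soma diameter, 5 turns, 3 target neurons.
spiral = spiral_scan(diameter_um=10.0, n_turns=5, n_points=500)
targets = [(0.0, 0.0), (40.0, -25.0), (-60.0, 30.0)]
paths = multiplex_spiral(spiral, targets)
print(f"{len(paths)} foci, {len(paths[0])} scan points each")
```

Because every focus follows the same trajectory at the same time, the scan duration in this picture is independent of the number of targets, consistent with the statement above that temporal resolution is not compromised relative to the single-cell case.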

Another category of parallel-2PE approaches uses phase modulation to increase the lateral extent of the excitation beam and perform scanless excitation [145]. For instance, CGH using SLMs can also be used to sculpt light into arbitrary shapes. This is generally combined with temporal focusing [146, 147] to preserve axial resolution, which scales linearly with lateral extent for holographic beams and quadratically for loosely focused quasi-Gaussian beams [148]. Techniques for distributing temporally focused light throughout a three-dimensional volume have been developed [114, 149, 150] and low-numerical aperture (NA) temporally focused Gaussian beams [113, 151], CGH, and generalized phase contrast (GPC) have all been applied to photostimulation and imaging [152–158]. Since parallel (scanless) 2PE methods can simultaneously excite opsins distributed throughout the soma, high photocurrents can be efficiently evoked independently of the off kinetics, which facilitates control of neuronal activity with sub-millisecond jitter [158]. Moreover, in contrast to scanning approaches, in parallel approaches the temporal resolution of the activation process is solely defined by the dwell time of the physiological process, i.e., the time for which the beam must remain on-site to evoke the desired physiological effect.

Having excited an indicator of neural activity using one of the methods outlined above, the next challenge is to detect fluorescent emission. Unfortunately, popular calcium and voltage reporters fluoresce in the visible region of the electromagnetic spectrum, although development of activity reporters fluorescent in the infrared (IR) is an active area of research [159–161]. Thus, visible photons emitted from fluorophores located deep in scattering tissue will typically experience multiple scattering events prior to detection. This is least problematic for sequentially scanned-2PE methods since all collected fluorescence can reasonably be assumed to have been generated by ballistic photons at the focal region. Hence any signal recorded at a given time can be assigned to the correct spatial location (again provided that the dwell time is longer than the fluorescence lifetime). 2PE imaging methods which record fluorescence from different voxels simultaneously are typically less robust against scattering. Beyond a few scattering lengths, the origin of fluorescent photons becomes ambiguous, which limits the depth of spatially multiplexed methods. Crosstalk can be reduced by increasing the spatial separation between excitation foci, but this is achieved at the cost of maximum acquisition rate for full-frame scanning [140]. Computational methods have also been developed to overcome scattering-induced ambiguity by exploiting priors such as high-resolution spatial maps [144, 162], temporal signatures [163–165], or adaptive optics [166–168]. Finally, to correctly identify signals from different neurons excited using three-dimensional, spatially multiplexed methods, the effective depth of field (DOF) of the detection axis must be extended with respect to the widefield case. Common extended DOF approaches include multi-focal plane microscopy [169] and point spread function engineering [170], which encodes information about axial position as lateral changes in intensity.

In spite of the number of technological developments outlined in this section, many 2P all-optical optogenetic studies performed to date have used parallel excitation via CGH (either extended holographic spots or spiral scanning) for photoactivation and galvanometric scanners (both resonant and not), occasionally combined with an ETL for calcium imaging across multiple axial planes [8, 103, 115, 141–143, 154, 156, 157, 171]. These studies have already provided novel insights into the principles of neural coding, and it is anticipated that the wider adoption of newer technologies will enable further progress. To assist in this dissemination, the next section will provide specific details about: laser sources required for all-optical neurophysiology experiments, the implementation of Generalized Phase Contrast and Temporal Focusing, and a protocol for preparing hippocampal organotypic slices.

#### 3 Implementation of Methods

#### 3.1 Laser Sources

The feasibility of two-photon all-optical neurophysiology projects is largely contingent on the first element in the optical path, the laser, which ultimately dictates experimental parameters such as the number of neurons that can be probed simultaneously, the maximal speed of interrogation, and which probes can be excited (according to their action spectrum). This section will provide a general review of the different laser characteristics that impact the efficiency of two-photon excitation and describe how the choice of laser can be optimized based on specific experimental parameters.

To review, two-photon excitation occurs when two photons, with sufficient combined energy, are absorbed quasi-simultaneously and a molecule is excited into a higher energy level [172]. The number of photons absorbed per molecule, per unit time, via two-photon absorption (N2P) is proportional to the two-photon cross section (σ2P) and to the square of the instantaneous intensity (N2P ∝ ⟨I(t)<sup>2</sup>⟩). The low values of typical 2PE cross-sections necessitate the use of high time-averaged photon fluxes to excite actuators and indicators at sufficient rates. This can be achieved using mode-locked lasers which generate femtosecond (fs) pulses of light. It is intuitive that, at a given average power, shorter pulses and fewer pulses per unit time result in a greater concentration of photons, which ultimately leads to a higher probability of quasi-coincident two-photon absorption. More formally, the concentration of photons in time can be parametrized according to the laser duty cycle, which is defined as the product of repetition rate (frep) and pulse duration (τpulse) and corresponds to the fraction of time per unit interval during which there is irradiance. Prior to saturation, and at a given average power, the rate of two-photon absorption is higher for pulsed lasers as compared with their continuous wave (CW) counterparts by a factor proportional to the inverse duty cycle:

$$\langle N_{2P} \rangle \propto \langle I(t)^2 \rangle \propto \frac{g_{\text{p}} \langle I(t) \rangle^2}{f_{\text{rep}} \tau_{\text{pulse}}}$$

where gp ≈ 0.558–0.664 [173] is a unitless factor which accounts for the fact that real pulses emitted from mode-locked lasers are not rectangular.
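As a rough numerical illustration of the relation above, the following sketch evaluates the factor gp/(frep τpulse) for one set of laser parameters. The function name and the chosen values (80 MHz, 100 fs, and gp = 0.664, the upper end of the range quoted above) are assumptions made for this example only.

```python
def two_photon_enhancement(rep_rate_hz, pulse_width_s, g_p=0.664):
    """Pulsed/CW ratio of the two-photon absorption rate at equal average power.

    Implements g_p / (f_rep * tau_pulse), i.e. g_p times the inverse duty cycle.
    """
    duty_cycle = rep_rate_hz * pulse_width_s  # fraction of time with irradiance
    return g_p / duty_cycle


# Assumed example parameters: 80 MHz repetition rate, 100 fs pulses.
enhancement = two_photon_enhancement(rep_rate_hz=80e6, pulse_width_s=100e-15)
print(f"enhancement over CW at equal average power: {enhancement:.3g}")
```

Because the factor depends only on the product frep τpulse, halving either quantity at fixed average power doubles the estimated two-photon rate.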

In fact, the wide adoption of 2P-LSM was aided by the development and commercialization of reliable, mode-locked lasers which provided enough energy to achieve sufficient rates of two-photon excitation of common fluorophores [174–176]. Ti:Sapphire oscillators exhibiting 100 fs pulse widths and 80 MHz repetition rates (12.5 ns pulse separation) have become the workhorses of sequential 2P-LSM since these lasers provide an ~100,000-fold increase in the rate of two-photon excitation as compared with CW excitation at the same average power, allowing 2P-LSM imaging to be performed using much more palatable average powers (milliwatts in comparison to kilowatts). However, these lasers no longer represent the gold standard for all-optical neurophysiology experiments, particularly those in which multiple neurons are probed simultaneously. The larger instantaneous extent of the excitation area in parallel methods, or the division of the original laser beam into a certain number of beamlets for parallel spiral scanning, necessitates the use of much higher peak pulse intensities. Two obvious strategies for increasing the peak power while maintaining average power are decreasing the pulse width or the repetition rate. In practice, some reduction of the pulse width below the standard 100 fs value is possible [177, 178], provided that the spectral width remains narrower than the action spectra of the actuators and indicators (to maintain excitation efficiency). However, this approach requires careful dispersion management, particularly when elements such as SLMs and diffraction gratings are employed in the optical path. Much larger gains can be achieved using amplified lasers with low repetition rates. Ytterbium-doped fiber lasers with central wavelengths in the region of 1030–1040 nm [179] are now commonly used for in vivo imaging and photostimulation, offering instantaneous powers that are orders of magnitude higher than conventional tunable lasers.
The use of Ytterbium-doped fiber amplifiers with microjoule pulse energies is necessary in order to simultaneously photostimulate neural ensembles composed of tens of neurons [5, 6, 103, 143]. Nevertheless, since these systems emit light at fixed wavelengths, the choice of opsin is constrained and multiple lasers with different wavelengths must be used to excite different sensors and actuators. Solutions that offer greater flexibility in terms of wavelength while delivering high energy (microjoule) pulses can be found in systems using optical parametric amplification (OPA) for the generation of the excitation beam [79, 81].
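To make the comparison between oscillators and low-repetition-rate amplifiers concrete, the sketch below computes pulse energy and an idealized peak power from average power, repetition rate, and pulse width. Both sets of laser parameters are hypothetical values chosen for illustration, not specifications of any particular system, and the rectangular-pulse approximation ignores the pulse-shape factor.

```python
def pulse_energy_j(avg_power_w, rep_rate_hz):
    """Energy per pulse: average power divided by repetition rate."""
    return avg_power_w / rep_rate_hz


def peak_power_w(avg_power_w, rep_rate_hz, pulse_width_s):
    """Peak power of an idealised rectangular pulse (pulse shape ignored)."""
    return pulse_energy_j(avg_power_w, rep_rate_hz) / pulse_width_s


# Hypothetical parameters: (average power [W], repetition rate [Hz], pulse width [s]).
lasers = {
    "oscillator (assumed 1 W, 80 MHz, 100 fs)": (1.0, 80e6, 100e-15),
    "amplifier (assumed 10 W, 500 kHz, 300 fs)": (10.0, 500e3, 300e-15),
}

for name, (power, rate, width) in lasers.items():
    energy = pulse_energy_j(power, rate)
    peak = peak_power_w(power, rate, width)
    print(f"{name}: {energy * 1e9:.3g} nJ per pulse, {peak / 1e3:.3g} kW peak")

osc_energy = pulse_energy_j(1.0, 80e6)
amp_energy = pulse_energy_j(10.0, 500e3)
```

Under these assumptions the oscillator delivers nanojoule pulses while the amplifier reaches the microjoule regime discussed above, which is what allows the beam to be divided among many targets.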

When probing biological preparations with such high irradiances (which can often exceed 10<sup>24</sup> photons cm<sup>-2</sup> s<sup>-1</sup>) it is of course necessary to consider the possibility of physiological perturbations. Photoperturbations based on linear absorption processes (N1P ∝ ⟨I(t)⟩), such as heating (via single-photon absorption) or optical trapping [180, 181], occur throughout the excitation beam while higher-order processes (NnP ∝ ⟨I(t)<sup>n</sup>⟩), for instance, photolysis, ablation, and optical breakdown [182–184], are confined to the focal region. This is particularly important to consider when choosing the appropriate excitation approach for photostimulation [185]: parallel methods generally use lower power density than spiral scanning but higher average powers. Since the optimum excitation parameters and signal to photoperturbation ratio are likely to be highly dependent on the specific characteristics of the sample preparation, it is advisable to vary the repetition rate, pulse width, and average power in each case if possible [186]. The optimal excitation parameters are likely to be different for different excitation modalities.

#### 3.2 Beam Shaping with Generalized Phase Contrast

As outlined in Sect. 2.3, many parallel two-photon excitation approaches rely on lateral beam sculpting. A correspondingly wide variety of methods based on amplitude or phase modulation have been conceived and demonstrated experimentally. Phase modulation is generally preferable since it is more power efficient than amplitude modulation. Computer-generated holography (CGH) is currently the most common phase modulation method used for photoactivation or imaging in all-optical neurophysiology experiments. Since CGH is described in detail in other chapters of this book (Chaps. 3, 4, and 11), this section will focus on the principles and implementation of an alternative phase modulation approach: generalized phase contrast (GPC) [187].

GPC is an efficient approach for transverse beam shaping and has been applied to imaging [188, 189], photomanipulation [190–192], and atom trapping [193]. GPC patterns have smooth, speckle-free intensity profiles and can be combined with temporal focusing for depth-resolved, robust excitation, deep in scattering tissue [147, 194]. As illustrated in Fig. 5a, in GPC, the phase imprinted on a beam (using a phase mask or an SLM) is mapped to intensity variations in a conjugate image plane by engineered constructive and destructive interference. The simplest implementations of GPC are based on 4f arrangements of lenses, constructed as follows (Fig. 5a): the first phase modulating element (hereafter SLM) is located a distance f1 prior to the first lens (L1), which has focal length f1, and is referred to hereafter as the Fourier lens. The necessary SLM phase (ϕxy(x,y)) depends on the spatial profile of the desired pattern. For binary GPC, ϕxy = ϕ1 for SLM pixels inside the pattern and ϕxy = ϕ2 for SLM pixels outside of the pattern; ϕ1 = π and ϕ2 = 0 is a simple (and useful) choice. An element known as a phase contrast filter (PCF) is located in the Fourier plane (FP) of L1, and a distance f2 prior to the second lens (L2), which has focal length f2. The PCF applies a selective phase shift to the field in the Fourier plane. The phase shift imparted by the PCF depends on its thickness (d) and the refractive index of the substrate (n2): ϕPCF = (2πd(n2 - n1))/λ, where n1 is the refractive index of the medium surrounding the PCF (usually n1 = 1 for air, and n2 = 1.45 for a PCF fabricated with fused silica). For binary GPC, and ϕ1 = π, ϕ2 = 0, constructive interference in the output pattern occurs for ϕPCF = π. The resulting interference pattern is formed in the image plane (IP) of the second lens, a distance f2 from L2.

Fig. 5 Wavefront engineering based on Generalized Phase Contrast. (a) (i) Schematic representation of a common configuration for Generalized Phase Contrast. The beam is modulated using an SLM, which is used to impart a phase shift to the portion of the beam corresponding to the desired pattern. The SLM phase should match the desired pattern (up to a magnification factor according to the respective focal lengths of L1 and L2). In the binary case, the SLM is usually used to impart a π phase shift to the pixels within the pattern and 0 to those outside. The synthetic reference beam is the portion that is phase shifted by the phase contrast filter (PCF), which typically imparts a π phase shift relative to the field that does not pass through the PCF, referred to here as the modulated beam. The different portions of the beam are recombined by L2 in the Image Plane (IP), where the modulated and synthetic reference fields interfere to form the desired pattern. (ii) Cartoon representations of the ideal 2D amplitudes and phases of the electric fields in the input (SLM) plane and the output (Image) plane. The phase profile of a typical PCF is shown centrally, with the filter diameter indicated by dashed black lines. (iii) 1D cross sections of the amplitudes and phases of the electric fields in the case of binary circle GPC. (b) 2-photon excited fluorescence from a thin rhodamine layer for two different patterns: circle and ring GPC. Scale bars represent 10 μm
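The PCF phase relation quoted in the text, ϕPCF = 2πd(n2 - n1)/λ, can be inverted to estimate the step height needed for a given phase shift. The sketch below is a minimal example under assumed values (a 1030 nm wavelength and the fused silica index of 1.45 given in the text); the function names are illustrative.

```python
import math


def pcf_phase_shift_rad(depth_m, wavelength_m, n_filter=1.45, n_medium=1.0):
    """Phase shift of a PCF step: 2*pi*d*(n2 - n1)/lambda."""
    return 2.0 * math.pi * depth_m * (n_filter - n_medium) / wavelength_m


def pcf_depth_for_phase_m(phase_rad, wavelength_m, n_filter=1.45, n_medium=1.0):
    """Step height d that gives the requested phase shift (inverse relation)."""
    return phase_rad * wavelength_m / (2.0 * math.pi * (n_filter - n_medium))


# Assumed wavelength of 1030 nm; a pi shift is the binary GPC case in the text.
depth = pcf_depth_for_phase_m(math.pi, wavelength_m=1030e-9)
print(f"step height for a pi shift: {depth * 1e9:.0f} nm")
```

Since the required depth scales with wavelength, a filter etched for one laser line imparts a different phase at another, which is worth checking before reusing a phase mask.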

To some extent, the perceived complexity of GPC arises from the number of different parameters that contribute to pattern fidelity. To elucidate the effects of some of these parameters, we discuss their impact on three metrics of pattern quality relevant to two-photon excitation: efficiency, uniformity, and contrast. In this context, efficiency is defined as the fraction of total energy contained within the pattern, uniformity as the inverse of the curvature of intensity within the pattern, and contrast as the normalized difference between the maximum and minimum intensity in the pattern vicinity, (Imax - Imin)/(Imax + Imin). While it is generally desirable that these metrics are maximized for two-photon excitation based on sculpted light, this cannot be achieved using low NA Gaussian beams, where uniformity throughout the region of interest (typically the neuronal soma) necessitates use of a large beam waist, resulting in low pattern efficiency. To explain how uniformity and efficiency can be jointly maximized in GPC, we will consider a simple example based on an input Gaussian beam and a simple binary pattern commonly used for two-photon excitation: a circular disk of uniform intensity (Fig. 5a, b).

Consider the propagation of the field modulated by the SLM through the system in the absence of the PCF (Fig. 6a, upper). In the image plane, the modulated field is a magnified image (according to the ratio of f2/f1), of the input field with the imprinted phase profile ϕxy. Given the modulated field in the image plane, it is possible to find the corresponding ideal "reference field", which, summed with the modulated field would generate the desired pattern with maximal efficiency, uniformity, and contrast (Fig. 6a, lower). This requires total constructive interference between the reference and modulated fields at all positions in the image plane within the pattern and total destructive interference at all positions outside. Achieving this stringent condition requires that the modulated and reference fields:


In GPC, the reference field is derived from the input field itself: the portion of the field that is phase shifted by the PCF can be considered a so-called "synthetic reference field" (SRF). The propagation of the SRF through the 4f system can be considered separately from the rest of the field (hereafter referred to as the modulated field), as demonstrated in Fig. 6b. The efficiency, uniformity, and contrast in the output pattern are maximized by finding the properties of the PCF such that the SRF approaches the ideal reference field while the modulated field is minimally perturbed. The optimal characteristics of the PCF for satisfying conditions (a)–(d) in the image plane can be deduced by comparing the profiles of the ideal reference and modulated fields to the synthetic reference field in the Fourier plane (Fig. 6b). For instance, it is clear that for the particular binary example of a disk, the synthetic reference field should be phase shifted by π in order to resemble the ideal reference field (Fig. 6b). Secondly, the edges of the PCF should coincide with the first zero-crossings of the modulated field, and thirdly the form of the phase contrast filter should reflect the symmetry of the desired intensity pattern (for instance, the highest fidelity circular patterns are obtained using circular filters, whereas elliptical patterns would benefit from correctly oriented elliptical filters). More complex patterns would benefit from more complex filter shapes, although high efficiencies (>60%) can still be achieved by using more common circular or rectangular filters. In the case of the circular disk pattern with an appropriately sized PCF, the efficiency of the output pattern is theoretically 70–80% and the maximum intensity is 3× higher than the de-magnified input Gaussian beam [195].
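The behaviour described above can be explored with a simple scalar Fourier-optics model of the 4f system. The following sketch is an illustration under stated assumptions rather than the simulation used by the authors: the input is an ideal Gaussian of unit amplitude waist, the SLM imprints a π disk, the PCF is a circular π filter, and lens apertures, pixelation, and image inversion are ignored. The parameter values (pattern radius of 0.4 waists, PCF radius of 1.1 focused-beam waists) are assumed for the example.

```python
import numpy as np


def simulate_binary_gpc(n=1024, extent=32.0, pattern_radius=0.4,
                        pcf_radius_over_waist=1.1):
    """Scalar 4f model of binary disk GPC; lengths in units of the input waist."""
    dx = extent / n
    x = (np.arange(n) - n // 2) * dx
    xx, yy = np.meshgrid(x, x)
    r = np.hypot(xx, yy)

    amplitude = np.exp(-r**2)                  # Gaussian input, 1/e^2 intensity radius = 1
    in_pattern = r <= pattern_radius           # SLM pixels that receive a pi phase
    field_in = amplitude * np.where(in_pattern, -1.0, 1.0)  # exp(i*pi) = -1

    # Angular spatial frequencies in the ordering returned by fft2.
    k = 2.0 * np.pi * np.fft.fftfreq(n, d=dx)
    kx, ky = np.meshgrid(k, k)
    focused_waist = 2.0                        # 1/e^2 radius of the unmodulated focus in k-space
    pcf_radius = pcf_radius_over_waist * focused_waist
    pcf = np.where(np.hypot(kx, ky) <= pcf_radius, -1.0, 1.0)  # pi shift inside the filter

    field_out = np.fft.ifft2(np.fft.fft2(field_in) * pcf)

    i_in = np.abs(field_in) ** 2
    i_out = np.abs(field_out) ** 2
    return {
        "efficiency": i_out[in_pattern].sum() / i_in.sum(),  # energy fraction inside the disk
        "peak_gain": i_out.max() / i_in.max(),               # peak relative to the input Gaussian
        "energy_ratio": i_out.sum() / i_in.sum(),            # ~1 for a phase-only filter
    }


result = simulate_binary_gpc()
print({name: round(float(value), 3) for name, value in result.items()})
```

The returned efficiency and peak gain can be compared with the 70–80% efficiency and roughly threefold intensity increase quoted above, and the two radius parameters can be scanned to see how sensitive the pattern is to the filter size.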

Fig. 6 Intuitive optimization of phase contrast filter for GPC. (a) The ideal reference field would generate total constructive interference at all positions in the image plane within the desired pattern and total destructive interference at all positions outside. The ideal reference field can be calculated by subtracting the modulated field (i.e., the magnified image of the field at the SLM plane) from the field corresponding to the desired pattern. The colors of the field profiles represent their phase ϕ (blue: ϕ = 0 and red: ϕ = π). (b) The Fourier transform (denoted F) of this ideal reference field gives its profile in the Fourier plane, where the PCF is located. The profiles of the ideal reference field and modulated field in the Fourier plane are used to guide the choice of an optimal PCF in GPC. The optimal PCF parameters are those for which the synthetic reference field most closely matches the ideal reference field. It is clear that this occurs when the PCF imparts a π phase shift and its edges coincide with the first zero crossings of the modulated field (indicated by black dashed lines). (c) Since the synthetic reference field cannot completely match the ideal reference field, there exist some differences between the ideal output field and that which is obtained. In most cases, there is a mismatch between the beam waists of the synthetic reference and the modulated fields, resulting in a "ring-of-light" surrounding the output pattern (highlighted by gray arrows). This is normally blocked by an iris positioned in a conjugate image plane. Furthermore, since the synthetic reference field is typically composed of the low spatial frequency components of the field, there are no small features and the Gaussian profile of the input beam is not compensated for (highlighted by black arrows)

The remaining 20–30% loss in efficiency is mainly a result of the differences between the synthetic and ideal reference fields. Firstly, the SRF has a narrower diameter and lower amplitude than the ideal reference field in the Fourier plane (Fig. 6b). Consequently, the SRF in the image plane is broader and of lower amplitude than the modulated field and condition (a) is not met, resulting in a dim halo of light (Fig. 6c) surrounding the pattern due to incomplete destructive interference. Note that the extraneous light can be blocked using an iris in a conjugate image plane if problematic. Secondly, since the SRF generated using a PCF to transmit the central lobe is constituted of low spatial frequency components, any sharp features in the synthetic reference wave are precluded. This reduces the uniformity of the pattern with respect to the ideal case, since the SRF retains the Gaussian envelope of the input beam (unless a beam shaper is used prior to the first SLM such that it is illuminated with a top-hat beam [195]). For small filters, the SRF approaches the "DC component" of the incident field, which is a Gaussian envelope for most experimental configurations. As the filter size increases, the SRF more closely resembles a magnified version of the input field, while the modulated field only contains the high spatial frequency components of the input pattern, for instance, the pattern edges and small features. The best pattern (highest efficiency, uniformity, and contrast) is achieved when the edges of the PCF coincide with the first zero crossings of the modulated field.

The concept of minimizing the differences between the synthetic and ideal reference fields to maximize efficiency, uniformity, and contrast is more general than the simple circular disk example presented and has been verified for a variety of analytically tractable patterns [195, 196]. Experimentally, the properties of the synthetic reference and modulated fields depend on interdependent system parameters such as the diameter and profile of the input beam, the spatial profile of the phase imparted by the SLM (i.e., the desired output pattern), and the focal length of L1. Since these parameters are interrelated, it is useful to introduce a level of abstraction and optimize the efficiency, uniformity, and contrast as a function of ξ and η, where ξ is defined as the ratio of the pattern radius at the SLM to the waist of the input Gaussian beam and η is defined as the ratio of the radius of the focused beam in the Fourier plane to the radius of the PCF [196]. For certain patterns, the optimal values of ξ and η can be found analytically; for more complex patterns, they can be found numerically via simulations.

The best approach for achieving the theoretically optimal values of ξ and η experimentally depends on the precise constraints of the experimental setup:


In optical systems necessitating volumetric two-photon excitation, GPC is generally combined with CGH for flexible 3D pattern projection [197] and temporal focusing to improve the axial resolution [147, 150]. In such systems, the downstream parameters are tightly constrained to achieve a field of excitation with a particular extent, and to meet the conditions described in the following section, necessary to achieve optimal temporal focusing. Hence, the extent of the SLM phase pattern for GPC is typically set according to the desired size of the pattern at the focal plane of the microscope objective and L1 is kept fixed, while the input beam and PCF diameters are varied in order to optimize efficiency, uniformity, and contrast in the output pattern. The optimization process for a given set of experiments is eased by making it possible to tune the diameter of the incident beam without altering its divergence (for instance by having a variety of suitable telescopes mounted on switchable magnetic bases). It is further eased by imprinting a selection of suitable PCFs (with a range of diameters and shapes) on a phase mask, which is then mounted on a three-axis micrometer stage so that one can easily switch between PCFs. For a given experiment, and desired sculpted light pattern, the initial choice of PCF diameter is generally guided by simulations. In lieu of simulations, a sensible starting point is to choose a PCF diameter matched to the beam waist of the unmodulated Gaussian beam in the Fourier plane and then to test several PCFs with similar diameters to maximize efficiency, uniformity, and contrast. Using this strategy, efficiencies greater than 70% can be routinely obtained experimentally.
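The starting point suggested above can be estimated with the standard Gaussian-beam focusing relation wFP = λf1/(πw0), where w0 is the 1/e² intensity radius of the collimated input beam; this relation is a textbook result and is not given explicitly in the chapter. In the sketch below, the numerical values, the interpretation of "matched" as a PCF diameter equal to 2wFP, and the set of scale factors are all assumptions for illustration.

```python
import math


def fourier_plane_waist_m(wavelength_m, focal_length_m, input_waist_m):
    """1/e^2 radius of an ideal Gaussian beam focused by a lens: lambda*f/(pi*w0)."""
    return wavelength_m * focal_length_m / (math.pi * input_waist_m)


def candidate_pcf_diameters_m(waist_m, scale_factors=(0.8, 0.9, 1.0, 1.1, 1.2)):
    """PCF diameters bracketing the focused beam diameter (2*waist); factors are arbitrary."""
    return [2.0 * waist_m * s for s in scale_factors]


# Assumed values: 1030 nm laser, f1 = 300 mm, 2 mm input waist at the SLM.
waist = fourier_plane_waist_m(1030e-9, 0.300, 2e-3)
candidates = candidate_pcf_diameters_m(waist)
print(f"focused waist: {waist * 1e6:.1f} um")
print("PCF diameters to test (um):", [round(d * 1e6, 1) for d in candidates])
```

A list of this kind can guide which filter diameters to include on the phase mask before testing them experimentally.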

#### 3.3 Implementing Temporal Focusing

Due to their interferometric character, GPC patterns suffer from a lack of axial confinement and optical sectioning [147] (this is in notable contrast to patterns generated using CGH). Temporal focusing has been used to restore axial resolution for GPC, and other extended light patterns that have been used for 2P optogenetics [147, 153, 171, 194]. The implementation of temporal focusing in combination with Gaussian beams has been extensively described in the literature [127, 198–200] and more detailed descriptions may be found in this book (Chaps. 4 and 9).

In summary, an optical element placed in a conjugate image plane of the optical path is used to separate the spectral frequencies (hereafter "colors" for simplicity) of the femtosecond laser pulses. While diffusers (scatterers) and diffraction gratings are both suitable optical elements, diffraction gratings offer a more efficient directional separation of the different colors and can be used in conjunction with lasers commonly used for multiphoton microscopy (which exhibit characteristic pulse durations of hundreds of femtoseconds, corresponding to pulse bandwidths of tens of nanometers). Beam expanders are commonly used prior to the scatterer or diffraction grating to adjust the beam diameter in order to achieve the desired Gaussian beam size at the sample. The orientation of the grating is usually chosen such that the first order is diffracted perpendicular to the grating (by convention, this corresponds to θdiff = 0°) (Fig. 7) to avoid tilted illumination of the image plane at the sample in the case of a large ROI illumination or a large field of excitation. However, this is not generally the same angle that would maximize the light throughput for a blazed grating (the so-called Littrow configuration) (see Note 1).

The groove density of the diffraction grating, G, and the focal length f of the lens used as the tube lens of the microscope (Fig. 7) should be chosen according to the properties of the microscope objective: achieving the tightest axial confinement requires meeting two conditions. Firstly: the extent of the chirped beam L should fill the diameter of the back focal plane (dbfp); the extent of the chirped beam due to linear dispersion induced by the diffraction grating is:

$$L = \frac{dx}{d\lambda} \Delta\lambda = \frac{f}{d_{\text{G}}} \Delta\lambda, \tag{1}$$

with dx/dλ the linear dispersion induced by a grating with groove spacing dG = 1/G (where G is the groove density in lines/mm), and Δλ the spectral bandwidth. The second condition for maximal axial confinement is that the instantaneous illuminated area of the scatterer/diffraction grating should be imaged to a diffraction-limited spot at the focal plane of the objective, which occurs when:

$$\frac{c\tau}{\sin\theta} \approx \frac{M\lambda}{2\,\text{NA}}, \tag{2}$$

where c is the speed of light in vacuum, τ is the laser pulse duration, θ is the incident angle of the light beam on the grating, λ is the wavelength, NA the numerical aperture of the objective, and M the effective magnification between the scatterer/diffraction grating and the sample. When temporal focusing is combined with light patterning (such as CGH, GPC, or other approaches for intensity modulation), the grating is generally positioned in the output plane of the patterning method. In this case, the conditions outlined above for optimum axial confinement remain valid. However, the dimensions of the beam at the back aperture of the objective depend on the patterning method and may vary with respect to the case of a Gaussian beam. Patterned light generated using GPC (or amplitude modulation) resembles a Gaussian beam in that it exhibits a smooth phase profile and a large depth of focus (confocal parameter). Changing the pattern changes the illumination of the back aperture but does not improve the axial resolution since the linear dispersion of the diffracted beam is unaffected.

Fig. 7 Implementation of temporal focusing with light-shaping methods. (a) Temporal focusing of a Gaussian beam. A diffraction grating placed in a conjugate image plane of the optical path is used to separate the spectral frequencies ("colors") of the femtosecond laser pulses. The grating is illuminated with a parallel Gaussian beam of the appropriate size adjusted through a beam expander, for giving the desired beam size at the sample plane. The orientation of the grating is usually chosen such that the 1st order is diffracted perpendicular to the grating (θdiff = 0°) to avoid tilted illumination of the image plane. Conjugation of the grating (image) plane to the sample image plane is realized by a telescope consisting of a lens and the microscope objective. (b) In temporal focusing of CGH beams the grating is placed at the image plane of CGH, illuminated with the holographic pattern generated by addressing the corresponding phase on the SLM (inset). (c) Similarly as in CGH, in temporal focusing of GPC beams the grating is illuminated with the intensity pattern generated at the output (image) plane of the GPC configuration, when addressing the SLM with the appropriate, in the simplest case binary, pattern (inset). In all panels θ denotes the incident angle of the light-shaped beam onto the grating. In (b) and (c) the beam expander prior to the SLM is omitted for simplicity. (Adapted from Ref. [201])
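The two conditions for axial confinement, Eqs. (1) and (2), are easy to evaluate for a candidate set of components. The sketch below is a minimal example: the incidence angle is obtained from the standard grating equation with θdiff = 0 (an added assumption consistent with the orientation described in the text), and all numerical values (wavelength, groove density, focal length, bandwidth, pulse duration, magnification, NA) are hypothetical.

```python
import math

SPEED_OF_LIGHT = 299_792_458.0  # m/s


def incidence_angle_rad(wavelength_m, grooves_per_mm, order=1):
    """Incidence angle sending the given order along the grating normal.

    Uses the standard grating equation d*sin(theta) = m*lambda with theta_diff = 0.
    """
    sin_theta = order * wavelength_m * grooves_per_mm * 1e3
    if not 0.0 < sin_theta < 1.0:
        raise ValueError("no propagating order for these parameters")
    return math.asin(sin_theta)


def chirped_beam_extent_m(focal_length_m, bandwidth_m, grooves_per_mm):
    """Eq. (1): L = f * delta_lambda / d_G, with d_G = 1/G the groove spacing."""
    groove_spacing_m = 1.0 / (grooves_per_mm * 1e3)
    return focal_length_m * bandwidth_m / groove_spacing_m


def eq2_sides_m(pulse_width_s, theta_rad, magnification, wavelength_m, na):
    """Return (c*tau/sin(theta), M*lambda/(2*NA)), the two sides of Eq. (2)."""
    lhs = SPEED_OF_LIGHT * pulse_width_s / math.sin(theta_rad)
    rhs = magnification * wavelength_m / (2.0 * na)
    return lhs, rhs


# Hypothetical setup: 1030 nm, 830 lines/mm grating, 500 mm lens, 10 nm bandwidth.
theta = incidence_angle_rad(1030e-9, 830)
extent = chirped_beam_extent_m(0.5, 10e-9, 830)
lhs, rhs = eq2_sides_m(300e-15, theta, 50.0, 1030e-9, 1.0)
print(f"incidence angle: {math.degrees(theta):.2f} deg")
print(f"chirped beam extent L: {extent * 1e3:.2f} mm (compare with the back aperture)")
print(f"Eq. (2): {lhs * 1e6:.1f} um vs {rhs * 1e6:.1f} um")
```

Comparing L with the back focal plane diameter of the objective, and the two sides of Eq. (2) with each other, indicates how far a given choice of grating and lens is from the conditions stated in the text.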

The illumination of the back aperture is different in the case of CGH. CGH setups are usually designed so as to illuminate the entire back aperture of the objective; hence extended CGH spots have intrinsically better axial confinement than Gaussian beams of similar lateral extent. Addition of a diffraction grating at a conjugate image plane in the CGH setup for temporal focusing affects the illumination of the objective back aperture in the dispersive direction. Because in CGH configurations the scatterer/grating is illuminated with a focusing beam, the linear dispersion of the field at the back aperture is strongly dependent on the focal length of the lens used prior to the grating/scatterer. A detailed analytical description of the full-width-at-half-maximum (FWHM) of the illumination distribution along the x and y axes at the back aperture (more precisely the back focal plane) of the objective in a CGH-TF setup may be found in Ref. [149]:

$$\text{FWHM}_{x_{\text{BFP}}} = 2\sqrt{2\ln 2} \sqrt{\frac{2\sigma^2 \cos^2(\theta) f_2^2}{f_1^2} + \frac{2 f_2^2 \Delta\lambda^2}{d_{\text{G}}^2}} \tag{3}$$

$$\text{FWHM}_{y_{\text{BFP}}} = 2\sqrt{2\ln 2}\, \frac{f_2}{f_1} \sigma \tag{4}$$

where 2σ is the waist of the Gaussian beam illumination at the SLM, and f1, f2 are the focal lengths of the lenses used to conjugate the SLM at the back focal plane of the objective (Fig. 7b). To optimally illuminate the back aperture and maximize axial confinement, focal lengths f1, f2 ought to be chosen according to the constraints set by equations (3), which replaces equation (1), and (4), while also aiming to satisfy equation (2). Since these lenses also dictate the extent of the field of excitation, it may be necessary to compromise field of view for desired axial resolution (or vice versa). For more details refer to Note 2.

Temporal focusing microscopy is inherently a two-dimensional method, since two-photon excitation only occurs in the vicinity of the focal plane of the objective, which, by design, is conjugate to the grating plane. Even if CGH was used to project patterns onto different axial planes, two-photon excited fluorescence would only be excited by the patterns projected onto the grating, the rest being suppressed by temporal focusing. Extending temporally focused excitation to three-dimensional (3D) space requires spatial multiplexing using an SLM in a conjugate Fourier plane following the grating to modulate the phase of each monochromatic beam. Convolution of the temporally focused pattern projected on the grating with the 3D configuration of beamlets at the sample plane creates multiple temporally focused patterns at the position of the beamlets, which are replicas of the original pattern on the grating. A detailed description of the 3D holographic spatial multiplexing implementation on temporally focused Gaussian beams can be found in Chap. 4, describing the technique 3D-SHOT (3D-Scanless Holographic Optogenetics with Temporal focusing) [5, 114].
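The replication step described above, a convolution of the pattern on the grating with the configuration of beamlets, can be illustrated in a single plane with a discrete convolution. The sketch below is a toy model only: it treats the beamlets as ideal delta functions on a pixel grid, ignores axial displacement, diffraction efficiency, and optics, and uses arbitrary positions and sizes chosen for the example.

```python
import numpy as np


def replicate_pattern(pattern, spot_positions):
    """Circularly convolve a centred 2D pattern with delta functions at the given pixels."""
    spots = np.zeros_like(pattern, dtype=float)
    for row, col in spot_positions:
        spots[row, col] = 1.0                      # one ideal beamlet per target position
    centred = np.fft.ifftshift(pattern)            # move the pattern centre to index (0, 0)
    product = np.fft.fft2(centred) * np.fft.fft2(spots)
    return np.real(np.fft.ifft2(product))          # convolution theorem


n = 128
yy, xx = np.indices((n, n))
disk = (np.hypot(yy - n // 2, xx - n // 2) <= 6).astype(float)  # pattern on the grating
positions = [(30, 30), (30, 90), (95, 60)]                      # arbitrary beamlet positions
replicas = replicate_pattern(disk, positions)
print("intensity at beamlet centres:", [round(float(replicas[p]), 3) for p in positions])
```

Each beamlet position receives a copy of the original disk, which is the sense in which the multiplexed spots are replicas of the pattern projected on the grating.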

For greater flexibility in the choice of the excitation shape and size, 3D holographic spatial multiplexing of temporally focused CGH, GPC patterns, or patterns created with amplitude modulation techniques can be used [150]. In those cases what changes in the optical setup is the way the beam is modulated before the grating (Fig. 8). The shape of the beam that is projected onto the grating is the one that is next replicated by 3D point-cloud CGH. For applications where projection of different shapes or spot sizes is necessary, a further variant of the above approaches is to perform light shaping in two dimensions (2D) with a first SLM that is vertically tiled in regions, each one encoding a different pattern, and to use a second SLM, tiled in the same number of regions, each addressed with a different phase profile that independently controls the position at which the corresponding pattern is projected in the sample. Such an example is described in reference [150], where the regions of SLM2 are addressed with phase profiles that control only the axial position of the different patterns. A similar approach for projecting different shapes at different positions is also presented in [150].

Fig. 8 Multiplexed temporally focused CGH patterns. Projection of temporally focused patterns in multiple planes consists of a 3-step approach: 1. beam amplitude shaping, here by CGH, 2. performing temporal focusing, and 3. spatial multiplexing by using an SLM (SLM2) and 3D point-cloud CGH, at a Fourier plane after dispersion of the spectral frequencies on the grating. The pattern generated through phase modulation (inset SLM1) is projected onto the grating (F(X,Y) inset) and replicated by 3D point-cloud CGH to different 3D positions (G(X,Y,Z) inset SLM2). The way SLM2 is illuminated is also shown in the inset. The resulting pattern at the sample is a convolution of patterns F and G. (Reproduced from Ref. [201])

The choice of the different lenses in this kind of configuration is constrained by the requirement of filling the back aperture of the microscope objective used to project patterns into the sample. All telescopes between the grating and the objective must be accounted for. Thus, in the case of CGH (Fig. 8), for instance, equations (3) and (4) are modified as follows:

$$\text{FWHM}_{x_{\text{BFP}}} = 2\sqrt{2\ln 2}\, \frac{f_4}{f_3} \sqrt{\frac{2\sigma^2 \cos(\theta)^2 f_2^2}{f_1^2} + \frac{2 f_2^2 \Delta k^2}{d_G^2}} \tag{5}$$

$$\text{FWHM}_{y_{\text{BFP}}} = 2\sqrt{2\ln 2}\, \frac{f_2}{f_1} \frac{f_4}{f_3}\, \sigma \tag{6}$$

Implementation of multiplexed temporally focused light shaping (MTF-LS) either with CGH or GPC, or any other kind of amplitude modulation is in general more demanding in terms of

Fig. 9 Organotypic slices. (a) Organization of the hood for the dissection of organotypic slices. (b) Dissected organotypic slices on PTFE membranes placed on inserts in a 6-well plate. (c) Transmitted light image of a patched cell in a hippocampal organotypic slice. (d) Expression of a nuclear targeted fluorescent protein (mRuby) following bulk infection with an AAV. The architecture of the hippocampus is maintained

alignment and equipment than multiplexing temporally focused Gaussian beams. Care should be taken to align the beam on the two SLMs used, one for light shaping (SLM1; Fig. 8) and the other for 3D point-cloud CGH (SLM2; Fig. 8).

In MTF-LS methods, including Gaussian beams (3D-SHOT), the excitation field is defined by the properties of the SLM used for 3D point-cloud CGH (pixel size and number of pixels, or the size of the SLM) and the telescope used to magnify it onto the back aperture of the objective (f4/f3 in our schematic). Calibration of the spot position between the SLM and the camera is achieved in the same manner as for CGH (see Sect. 3.2) and, similarly, the spot intensity over the entire excitation field must be homogenized by calibrating the diffraction efficiency of SLM2 both laterally in the image plane (xy) and axially throughout the excitation volume (z) (refer to Note 3 for further details). Depending on the light shaping method used, different calibration procedures for compensating diffraction efficiency in light intensity may be needed. For instance, in methods using SLM1 for controlling the lateral position of the spots on a plane, diffraction efficiency calibration of SLM1 is also necessary, and when the SLMs are tiled in different regions, since light diffracted from each of them is not illuminating the round back aperture of the objective in the same way, a special



#### Table 1 Solutions for organotypic hippocampal slice cultures preparation

a Be aware that antibiotics can affect some cellular properties [202]. It is important to establish that their use will not perturb or introduce any bias into the system under investigation. Organotypic hippocampal slices can also be prepared without antibiotics by maintaining a strict asepsis throughout the entire process


Always use sterile gloves and change them between each dissection.

	- (a) Tissue chopper set to cut 300 μm slices.
	- (b) Dissection stereomicroscope.
	- (c) Sterile transfer pipettes held in 15 mL tubes.
	- (d) Tools previously sterilized in 70% ethanol.
	- (e) Sterile PBS.

Before starting the dissection for each pup, place 4 petri dishes under the hood.


Excess dissection medium affects gas exchange and, consequently, slice health.


#### 3.4.3 Bulk Infection

Before infecting, it is essential to let the slices recover and adhere to the membrane for at least 3 days in the incubator following slicing. If possible, opt for AAVs as they are non-pathogenic, non-cytotoxic, and do not integrate into the host genome. Alternatively, to avoid the use of viruses, one can utilize electroporation of plasmids.

When testing a virus for the first time on organotypic slices, it is wise to test different dilutions of the virus, as different combinations of serotypes and promoters can lead to different optimal windows of expression.


#### 3.4.4 Troubleshooting

	- The slices should flatten and become transparent after a few days in culture (Fig. 9c). Dead slices remain whitish and opaque.

	- For optimal results, oxygenate the external solution.
	- pH should be kept around 7.4 (with HEPES or bicarbonate & bubbling).
	- Take the slice out of the incubator only once everything on the setup is ready.
	- Under optimal conditions the slices may be re-used across multiple experimental sessions.

#### 4 Notes

1. Tilted illumination of the grating at relatively large angles compared to the Littrow configuration is also used to increase the difference in optical path between the different diffracted colors [126]. In this way the temporal focusing effect is enhanced, because the depth of focus over which all colors arrive in phase to recreate the ultrashort pulse becomes smaller. This also helps when using temporal focusing with pulses in the range of hundreds of femtoseconds.


#### 5 Outlook

All-optical neurophysiology is evolving as a useful approach in neuroscience to decode patterns of neuronal activity and understand how these patterns contribute to neural disorders, to cognitive tasks, or to specific behaviors. Important achievements in molecular biology and development of advanced optical methods are contributing to elucidating the neural code. A plethora of optogenetic constructs with variable properties in terms of excitation spectrum, kinetics, and sensitivity can be used in combination with optical methods that provide high spatial specificity and temporal precision. It is now possible to manipulate brain activity at different spatiotemporal scales, throughout large excitation volumes, and also to reach deep brain regions [204–206]. One of the latest developments in the field is a bidirectional tool, BiPOLES, based on two potent channelrhodopsins: the inhibitory GtACR2 and the excitatory Chrimson [207]. Bidirectional tools have been developed almost since the advent of optogenetics [45, 208, 209], and BiPOLES builds upon twenty years of developments in opsin engineering and trafficking. As a result, both the excitatory and inhibitory opsins are efficiently trafficked to the membrane with equal sub-cellular distributions, and hence a tightly controlled ratio between excitatory and inhibitory action at specific wavelengths and membrane potentials is achieved. This means that neuronal activation and silencing can be controlled precisely and predictably in all transduced cells within a particular population. BiPOLES ought to facilitate a large number of loss- and gain-of-function experiments, which are necessary for proving the necessity and sufficiency of a particular circuit for a specific disease, as well as precise control of spike timing without changes in firing rate, and may enable the long sought-after optical voltage clamp [210].

In terms of the next steps for optical technology development, all-optical experiments will continue to benefit from the use of lower laser powers, increased acquisition or modulation speeds (for faster acquisition rates/higher temporal precision or larger fields of observation/manipulation), higher spatial resolution, and access to deeper brain regions. 2P excitation microscopy with scanners or scanless parallel illumination through spatial light modulators, through fibers or fiber bundles for endoscopic applications, with point spread function engineering for achieving mm<sup>3</sup> excitation volumes, or particular configurations allowing mesoscopic imaging, will continue to be developed for studying neural circuits. 3P microscopy has already been used for morphological and functional imaging beyond a depth of 1 mm, but it has not yet been explored for photoactivating neurons in deep brain regions. Moreover, although multiphoton excitation approaches have advanced separately for imaging or photostimulation of neurons, their combination for all-optical manipulation has evolved at a much slower rate and is limited to a handful of laboratories. All-optical neurophysiology experiments presented so far usually involve sophisticated photostimulation approaches using wavefront-engineering techniques, combined with standard galvanometric scanning microscopes and electrically tunable lenses for recording responses from neurons in different axial planes. Combining high-speed excitation and recordings throughout large, continuous volumes would have a great impact in the field.

The optical manipulation of neural circuits has the potential to be an extremely potent approach for understanding brain function but requires carefully chosen and calibrated tools and methodology in order to address the specific question under investigation. Now that all-optical manipulation of neurons is more accessible than ever for simple on-off control of cell excitation, finer control of the spatiotemporal characteristics of the excitation patterns is needed to detect subtle features of the neuronal response. This can help in distinguishing the role of a circuit's activity, by itself or in correlation with other circuits, and in observing how such differences may alter network performance or lead to a different behavior.

#### Acknowledgments

We thank Christianne Grimm for critical reading of the chapter. We acknowledge financing from the 'Agence Nationale de la Recherche' (ANR) project ANR-17-CE16-0021 SLALLOM and ANR-19-CE16-0026 HOLOPTOGEN, the IHU FOReSIGHT grant (Grant P-ALLOP3-IHU-000), the National Institutes of Health (Grant NIH 1UF1NS107574 – 01), the ERC Advanced Grant HOLOVIS (ERC-2019-AdG; Award no. 885090), and the FPSU-Chaire AXA-C18/1276.

#### References


m in living brains by use of a Ti: Al2O3 regenerative amplifier. Opt Lett 28:1022–1024


neuroscience. J Neurosci 40:4264. https://doi.org/10.1523/JNEUROSCI.0103-20.2020


light-gated potassium channels. bioRxiv:2021.09.17.460684


measured with calcium imaging and electrophysiology. PLoS Comput Biol 16:1–29. https://doi.org/10.1371/journal.pcbi.1008198


https://doi.org/10.1152/physiol.00036.2006


imaging in awake behaving animals. Nat Methods 13:1001–1004. https://doi.org/10.1038/nMeth.4033


limit. Optica 5:117. https://doi.org/10.1364/optica.5.000117


using stereoscopy (vTwINS). Nat Methods 14:420–426. https://doi.org/10.1101/073742


neural circuits. Nat Commun 8:116. https://doi.org/10.1038/s41467-017-00160-z


stoichiometric and co-localized expression of light-gated membrane proteins. Nat Methods 8:1083–1091. https://doi.org/10.1038/nmeth.1766



## Balancing the Fluorescence Imaging Budget for All-Optical Neurophysiology Experiments

### Peter Quicke, Carmel L. Howe, and Amanda J. Foust

#### Abstract

The goal of this chapter is to establish a framework to evaluate imaging methodologies for all-optical neurophysiology experiments. This is not an exhaustive review of fluorescent indicators and imaging modalities but rather aims to distill the functional imaging principles driving the choice of both. Scientific priorities determine whether the imaging strategy is based on an "optimal fluorescent indicator" or "optimal imaging modality." The choice of the first constrains the choice of the second due to each's contributions to the fluorescence budget and signal-to-noise ratio of time-varying fluorescence changes.

Key words Fluorescence, Calcium imaging, Voltage imaging, Neurophysiology, One-photon, Multiphoton

#### 1 Introduction

Optical methods provide powerful means to investigate the structure and function of many neurons simultaneously. Importantly, photons can be focused to, and imaged from, multiple neurons in parallel to control and detect their activity. Several new imaging strategies are developed every year, and neurophysiologists must navigate growing stacks of methods papers, microscope, and laser adverts to identify which modality can best achieve the scientific goals of their experiments. There are a broad set of competing requirements on the techniques used to image neuronal activity including speed, depth, spectral separation, robustness to scattering, and motion artifacts. The goal of this chapter is to distill the trade-offs driving choice of imaging strategy for all-optical experiments.

We begin by detailing two key challenges to imaging neuronal activity in intact brains. We then define signal-to-noise ratio and the concept of a "fluorescence budget," which ultimately determines how small and fast a transient signal can be resolved by a given imaging configuration. We then detail how the choice of fluorophore, in particular indicators of calcium and membrane potential, and imaging modality relate to the fluorescence budget, and the trade-offs inherent to the choice of each.

Eirini Papagiakoumou (ed.), All-Optical Methods to Study Neuronal Function, Neuromethods, vol. 191, https://doi.org/10.1007/978-1-0716-2764-8_2, © The Author(s) 2023

#### 2 Key Challenges to Imaging Neuronal Activity

#### 2.1 Challenge 1: Brains Are Three-Dimensional

Brains are three-dimensional (3D), while conventional microscopes image one plane at a time. A widefield fluorescence microscope (Fig. 1, left) excites fluorescence throughout a 3D volume and images fluorescence from a single plane. Unless the fluorophore itself is restricted to one plane (e.g., cultured cell monolayers), fluorescence is excited outside of the plane of focus (Fig. 2, left). At the camera chip, this out-of-focus light reduces the contrast of the in-focus image. Worse, in the context of imaging calcium or voltage signals, the labeled out-of-focus cells have individually varying fluorescence time courses that contribute to the in-focus time course. This makes standard widefield fluorescence imaging with single-cell resolution often untenable in densely labeled 3D samples. Many techniques have been developed to restrict fluorescence excitation (two/three-photon, light-sheet microscopy) and/or collection (confocal variants) to a single plane of focus and/or remove out-of-focus fluorescence in post-processing by combining multiple images with structured illumination [22, 63, 73, 76]. The ability to localize fluorescence to a plane orthogonal to the optical axis is termed "optical sectioning." This "optical section" can be scanned orthogonal to the plane of focus to build up a volumetric image plane by plane.

Fig. 1 Widefield, one-photon imaging (left) and multiphoton scanning (right) imaging modalities; not to scale

Fig. 2 The challenges of 3D imaging in brain tissue. (a) Techniques that are not optically sectioning suffer from blur caused by the collection of light from out-of-focus planes. (b) Light rays can be scattered before exiting the tissue, changing their apparent origin. In aggregate, this leads to a blurring effect that drastically increases with depth. Reproduced from [88], CC BY-SA 4.0

#### 2.2 Challenge 2: (Most) Brains Scatter Light

While certain model organisms (e.g., larval zebrafish, C. elegans) are transparent, most brains, especially mammalian brains, scatter light strongly. Fluorescence excitation efficiency decreases with increasing imaging depth as excitation light is scattered and absorbed, resulting in the degradation of the excitation point spread function (PSF). Moreover, light excited in one spatial location can be scattered before collection and detected as though arising from somewhere else (Fig. 2, right). This degrades image contrast and confuses analysis of fluorescence time courses in adjacent areas.

Together, out-of-focus fluorescence and scattered fluorescence produce blurring, which drastically increases with imaging depth [46]. The need for optical sectioning and robustness to scattering has driven development and application of two- [30] and three-photon [45, 121] (or "multiphoton") point scanning modalities, which achieve both. Due to non-linear dependence on excitation intensity, two- and three-photon fluorescence excited by high numerical aperture (NA) focusing generates femtoliter fluorescence volumes. Non-descanned fluorescence generated at the focus, scattered and non-scattered, is collected through the objective onto a large area detector (Fig. 1, right), commonly a photomultiplier tube. Two-dimensional images or 3D volumes are built up point by point by scanning the focus. This strategy enables calcium-based imaging of action potentials (APs) at depths up to 1 mm in mouse brains [59]. These techniques optically section and are robust to scattering; for structural imaging in live brains, this is as close as we have to perfect. Why, therefore, consider a different strategy for functional imaging? The next section introduces the key concept and guiding principle to designing and selecting a modality to image neuronal activity: the "fluorescence budget."

#### 3 Signal-to-Noise Ratio Is King: Fluorescence Budget

The principal consideration guiding optical neurophysiology systems is the signal-to-noise ratio (SNR) of time-varying fluorescence changes. Here the term signal-to-noise ratio most commonly refers to the ratio of the amplitude of a time-varying signal (S) transient to the "baseline noise" (σ), typically the root mean square of the signal that precedes or follows a transient peak (Fig. 3). Note that this definition differs from how signal-to-noise ratio is often defined in scientific and engineering disciplines (see Note 1). Note also that in this chapter we are dealing specifically with the SNR of temporal, not spatial, fluorescence changes.

Fig. 3 Signal-to-noise ratio can be defined as the ratio of the amplitude of a time-varying signal (S) transient to the "baseline noise" (σ)

The temporal SNR quantifies the ease with which a fluorescence transient can be resolved from a noisy time-varying signal. The SNR is equal to the product of the fractional change in fluorescence with respect to baseline, or "indicator sensitivity," and the square root of the total fluorescence collected per location (pixel or voxel), per integration period, the "fluorescence budget" (see Note 2):

$$\text{SNR} = \frac{\Delta F}{F_0} \sqrt{F_0}. \tag{1}$$

ΔF/F0 reflects the sensitivity of the indicator's fluorescence to the physiological signal (e.g., membrane potential or calcium). The fluorescence budget, F0, is given by

$$F_0 = \phi \, F_{gen}, \tag{2}$$

where ϕ (unitless) is the fluorescence collection efficiency and Fgen (in photons) is the fluorescence generated per location (pixel or voxel), per integration period. ϕ is determined by an assortment of factors dependent on the sample, such as imaging depth, wavelength-specific absorption, scattering length, and anisotropy, and specific to the collection optics, such as acceptance angle (numerical aperture, NA) and detector efficiency [129]. ϕ quantifies what fraction of fluorescence is collected by the imaging system. The fluorescence generated is

$$F_{gen} = R_{Fl} \, C_{Fl} \, V_{Fl} \, \Delta t, \tag{3}$$

where RFl is the fluorescence rate (photons/s/molecule), CFl is the fluorophore concentration (molecules/m<sup>3</sup>), VFl is the per-location fluorescence excitation volume (m<sup>3</sup>), and Δt is the integration period (s). The fluorescence rate is given by

$$R_{Fl} = \sigma_n \langle I^n \rangle \tag{4}$$

in photons/s, where σn is the n-photon brightness (1-, 2-, or 3-photon), the wavelength-dependent product of cross-section and quantum yield (m<sup>2n</sup> s<sup>n−1</sup>/photons<sup>n−1</sup>), and ⟨I<sup>n</sup>⟩ is the time average of the fluorescence excitation intensity raised to the n-th power (photons<sup>n</sup>/m<sup>2n</sup>/s<sup>n</sup>).<sup>1</sup> Ideally, ⟨I<sup>n</sup>⟩ is maximized to increase SNR up to the limit of photobleaching, heating, and, in the case of ultrafast pulsed excitation, non-linear damage thresholds. Substituting gives

$$\text{SNR} = \frac{\Delta F}{F_0} \sqrt{\phi \, \sigma_n \langle I^n \rangle \, C_{Fl} \, V_{Fl} \, \Delta t}. \tag{5}$$
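Eqs. 1–5 can be evaluated numerically with a few lines of code. The following is a minimal sketch, not part of the original chapter: the function names and all numerical values are hypothetical placeholders chosen to give round numbers, in arbitrary but mutually consistent units, and the calculation assumes the shot-noise-limited form of Eq. 1.

```python
import math


def fluorescence_budget(phi, sigma_n, mean_I_n, c_fl, v_fl, dt):
    """Collected photons per location per integration period (Eqs. 2-4)."""
    r_fl = sigma_n * mean_I_n        # Eq. 4: fluorescence rate per molecule
    f_gen = r_fl * c_fl * v_fl * dt  # Eq. 3: photons generated
    return phi * f_gen               # Eq. 2: photons collected, F0


def temporal_snr(dff, f0):
    """Eq. 1: indicator sensitivity times the square root of the budget."""
    return dff * math.sqrt(f0)


# Placeholder values only (not measured data).
f0 = fluorescence_budget(phi=0.1, sigma_n=1.0, mean_I_n=1e3,
                         c_fl=1e4, v_fl=1.0, dt=0.01)
print(f"F0 = {f0:.0f} photons, SNR = {temporal_snr(0.1, f0):.1f}")
```

Because F0 enters under a square root, quadrupling any single factor of the budget (for example Δt) only doubles the SNR, whereas doubling ΔF/F0 doubles it directly.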

The two imaging modalities introduced in the previous section, widefield and multiphoton scanning, are at opposite extremes in terms of fluorescence budget. Widefield integrates photons from all locations in a two-dimensional frame simultaneously throughout the frame period, and hence:

<sup>1</sup> Intensity is often normalized to photon energy in the multiphoton imaging literature [130], different from the standard definition of W/m<sup>2</sup>.

$$
\Delta t = \frac{1}{R_{acq} N_Z}, \tag{6}
$$

where Racq is the frame acquisition rate and NZ is the number of planes imaged per volume. In contrast, multiphoton point rastering modalities integrate fluorescence from each location for only a small fraction of the acquisition period

$$
\Delta t = \frac{1}{R_{acq} N_X N_Y N_Z}, \tag{7}
$$

where NX and NY are the lateral frame dimensions.

Consider the case where NZ = 1 to image a single plane, such that Racq is the frame rate, and compare the widefield to the multiphoton raster scan case. For the widefield case, Δt = 1/Racq. For the multiphoton scanning case, digitizing each frame as, for example, NX = NY = 256 pixels, we see that Δt = 1/(256<sup>2</sup> Racq). If we base our SNR comparison solely on differences in Δt, we see that the SNR for the widefield case is √(256<sup>2</sup>) = 256 times that of multiphoton scanning! Note that this factor could be even bigger if the scanning modality involves significant "dead time" (i.e., galvanometric scanner turnaround). Cameras also require readout time, during which they are not integrating photons; however, this is typically around 10 μs per line for a modern sCMOS camera, a small fraction of the frame period, while scanning dead time can be around 10–20% of each frame.
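The comparison above can be reproduced with a short calculation. This is an illustrative sketch only: the function names, the 30 Hz frame rate, and the way dead time is modeled (as a fraction of each frame period removed from the integration time) are assumptions of the example, not taken from the chapter.

```python
import math


def dt_widefield(r_acq, n_z=1):
    """Eq. 6: every pixel integrates for the whole frame period."""
    return 1.0 / (r_acq * n_z)


def dt_point_scan(r_acq, n_x, n_y, n_z=1, dead_fraction=0.0):
    """Eq. 7, optionally shortened by a scanner dead-time fraction.

    dead_fraction is an assumption of this sketch: it removes that share
    of each frame period from the time available for integration.
    """
    return (1.0 - dead_fraction) / (r_acq * n_x * n_y * n_z)


def snr_gain_widefield(r_acq, n_x, n_y, n_z=1, dead_fraction=0.0):
    """SNR ratio (widefield / point scan) when only dt differs (Eq. 5)."""
    wf = dt_widefield(r_acq, n_z)
    ps = dt_point_scan(r_acq, n_x, n_y, n_z, dead_fraction)
    return math.sqrt(wf / ps)


# Hypothetical 30 Hz acquisition of a 256 x 256 frame.
print(snr_gain_widefield(r_acq=30.0, n_x=256, n_y=256))
print(snr_gain_widefield(r_acq=30.0, n_x=256, n_y=256, dead_fraction=0.2))
```

The ratio does not depend on the frame rate, since Racq cancels; it depends only on the number of locations visited per frame and on the dead-time fraction.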

The optical sectioning and robustness to scattering achieved by 2P and 3P point scanning modalities over widefield comes at high cost to the fluorescence budget, F0. Note that in the multiphoton point scanning case, F0 is inversely proportional to the product of the number of locations (pixels or voxels, NXNYNZ) monitored, and the rate at which they are monitored, Racq. This implies that F0 can be maintained by performing high Racq acquisition of few locations or low Racq acquisition of many locations. For example, at the high rate extreme, 10 kilohertz acquisition of voltage transients from a single "voxel" has been achieved with two-photon excitation by parking the beam over a dendritic spine [3]. At the other extreme, two-photon "mesoscopes" image FOVs multiple millimeters wide at low frame rates (0.1–10 Hz) [102, 106] using slow and highly sensitive calcium indicators [25, 29].
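As a rough worked illustration of this trade-off, assume that all other factors in Eq. 5 are held fixed and that scanner dead time is ignored. The per-location integration time then depends only on the product of acquisition rate and number of locations:

$$
\Delta t = \frac{1}{R_{acq} N_X N_Y N_Z} = \frac{1}{10^{4}\ \text{Hz} \times 1} = \frac{1}{10\ \text{Hz} \times 10^{3}} = 100\ \mu\text{s}.
$$

That is, parking on a single location at 10 kHz gives the same Δt, and hence the same F0, as visiting 1000 locations at 10 Hz. The 1000-location figure is chosen here for illustration only and is not taken from the cited studies.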

From Eq. 5, we see that SNR can be increased by: 1. use of a high sensitivity (large ΔF/F0) fluorescence reporter and/or 2. increasing F0. F0 can be increased by: maximizing ϕ with efficient photo-sensors, high collection NA, etc.; selecting a bright fluorophore and exciting it at wavelengths for which its brightness, σn, is highest; maximizing fluorophore concentration, CFl, and exciting and collecting fluorescence over the largest possible voxel, VFl; exciting the fluorophore with the highest feasible intensity, ⟨I<sup>n</sup>⟩; and maximizing integration time, Δt. Table 1 summarizes the parameters contributing to SNR of functional fluorescence signals. The next sections discuss trade-offs driving choice of fluorophore and imaging modality.

#### Table 1 Summary of variables determining temporal SNR for functional fluorescence imaging

#### 4 Choosing a Fluorescent Indicator

A fluorescent indicator is an organic molecule or protein functionalized to transduce biophysical changes into changes in fluorescence. Most commonly used are indicators that respond to changes in membrane potential [11, 13] and calcium [7, 112], but others exist that respond to pH [93, 94], sodium [56, 61], and neurotransmitter concentration including glutamate [69, 70], acetylcholine [52], or GABA [71]. With regard to temporal SNR, the key parameters of such a molecule include its sensitivity (ΔF/F0) and brightness (σn) as described above and its temporal profile.

#### 4.1 Temporal Considerations

Here, indicator "temporal profile" refers to the convolution of the time course of a biophysical process (e.g., fast membrane potential variations vs. slow changes in calcium or neurotransmitter concentration) with the kinetics of the fluorescent indicator. We use τOff as a general term for the characteristic time of the decay back to baseline of the fluorescence transient induced by an action potential or other brief event. We note, however, that many indicators' kinetics are not described by a simple mono-exponential decay. Importantly, τOff determines the minimum Racq (and hence maximum Δt) able to resolve the time-varying signal without aliasing. For example, some highly sensitive Förster resonance energy transfer (FRET)-based voltage indicators temporally low pass filter fast events such as action potentials due to slow translocation of mobile charges in the plasma membrane [41]. While considered a disadvantage in most scientific contexts, this low pass filtering is an advantage for imaging, as slower transient changes in fluorescence can be imaged at lower rates Racq without aliasing, thereby increasing Δt and F0. The minimum Racq is defined by the functional fluorescence temporal profile and also by the goals of the experiment. For instance, Racq can be relatively low for mere transient detection but must be much higher to characterize transient timing and kinetics [90]. The implications of differing τOff for imaging strategy design can be readily appreciated through comparison of calcium and membrane potential indicators for neuronal AP detection.
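The link between τOff and the usable integration period can be illustrated with a deliberately simplified model. The sketch below assumes an instantaneous rise, a mono-exponential decay with time constant τOff, and an integration window that starts exactly at transient onset; under these assumptions the sample reports the mean of the decay over the window. The function name and the numerical values (a 30 Hz frame rate, τOff of 2 ms and 500 ms) are hypothetical.

```python
import math


def retained_peak_fraction(tau_off, dt):
    """Mean of exp(-t / tau_off) over one integration window [0, dt].

    Returns the fraction of the transient's peak amplitude that survives
    averaging over the integration period dt (same time units as tau_off).
    """
    x = dt / tau_off
    return -math.expm1(-x) / x  # (1 - exp(-x)) / x, numerically stable


frame_period = 1.0 / 30.0  # s, hypothetical 30 Hz acquisition
for label, tau in [("fast, voltage-like", 0.002), ("slow, calcium-like", 0.5)]:
    frac = retained_peak_fraction(tau, frame_period)
    print(f"{label}: tau_off = {tau} s, retained fraction = {frac:.2f}")
```

With these placeholder values the fast transient retains roughly 6% of its peak amplitude in a single 33 ms frame, while the slow transient retains about 97%, which is why slowly decaying signals tolerate long Δt and fast ones do not.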

#### 4.2 Membrane Potential vs. Calcium

The majority of all-optical neurophysiology experiments monitor neuronal membrane potential, either directly with voltage-sensitive fluorophores or indirectly through fluorophores sensitive to calcium concentration. In most cases, calcium transients are monitored to detect when a neuron fires an action potential; so why not always choose voltage-sensitive fluorophores to image membrane potential directly? The particular challenges of imaging membrane potential compared to calcium are summarized in Fig. 4. First, the physiological signal kinetics are fast, milliseconds in the case of action potentials, necessitating high Racq. High sampling rates limit the available photon integration time, Δt, demanding brighter (high σn) indicators for adequate SNR as discussed in Note 2.

While calcium indicators are distributed throughout the cytosolic volume, voltage indicators are confined to the membrane. The 100-mV membrane potential fluctuation during an action potential leads to an electric field change of around 3 × 10<sup>7</sup> V/m across the 3-nm-thick plasma membrane. Despite this large field, the fluorescent indicator's sensor molecules must orient across the neuron's external plasma membrane to sense any changes. This causes multiple challenges. First, the potential for physiological disruption by membrane indicator expression limits labeling density (and hence CFl). Membrane capacitance also increases with the number of charged or polarizable molecules in the plasma membrane, and over-labeling can even abolish action potentials completely [18]. Second, highly lipophilic dyes or poorly targeted genetic constructs can label membranes non-specifically, including internal membranes not exposed to changing fields. This increases background relative to the voltage-dependent signal, effectively reducing ΔF/F0 and proportionally the SNR. Third, overlapping membranes from adjacent cellular processes in densely labeled samples are indistinguishable in most microscope images. This results in signal mixing, especially with widefield single-photon illumination. Voltage transients from individual cells are then "washed out" by the bright background from adjacent cells.

Fig. 4 The challenges of voltage imaging. Three issues make voltage imaging more challenging than calcium imaging. First, faster intrinsic kinetics limit the photon integration period. Second, voltage indicators must lie in the membrane or degrade the signal; this limits the volume of indicators that can be integrated to measure the signal. Lastly, membranes where signal arises are tightly packed in the brain; fluorescent signals from overlapping membranes wash out single-cell signals. Adapted from original in [88], CC BY-SA 4.0
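The field estimate quoted for the action potential follows from treating the membrane as a uniform dielectric slab, which is a simplifying assumption made here for illustration:

$$
E = \frac{\Delta V}{d} = \frac{0.1\ \text{V}}{3 \times 10^{-9}\ \text{m}} \approx 3.3 \times 10^{7}\ \text{V/m}.
$$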

Calcium indicators are widely used in neuroscience to indirectly detect APs due to their relative ease of use compared to voltage indicators. These indicators change brightness when bound to Ca<sup>2+</sup> ions. As AP firing in most neurons is accompanied by a rapid, transient increase in intracellular Ca<sup>2+</sup> concentration, calcium sensors report a proxy for neuronal spiking. The increase in intracellular Ca<sup>2+</sup> on AP firing due to the opening of voltage-gated channels can be "amplified" by release from internal stores [14] and results in up to an order of magnitude change with a slow decay over hundreds of milliseconds [14, 60]. This low pass filters AP waveforms and enables longer integration times (Δt) without signal aliasing, increasing photon counts. Many more calcium indicators can be loaded into the cell compared to voltage indicators as calcium concentration changes throughout the cytosolic volume, increasing signal brightness. Combined, these advantages have made calcium imaging a popular technique enabling high SNR optical recording of AP activity.

Despite these advantages, there remain significant drawbacks to calcium indicators that limit their applicability. Calcium transients are not exclusively linked to action potential firing in all neurons [42, 68, 74]. AP-evoked calcium dynamics are also slow compared to APs, and indicator kinetics often further reduce the fluorescence response speed. Although beneficial for imaging, this limits the accuracy with which the timing of the underlying electrophysiological activity can be estimated, even when the link between APs and calcium transients is clear. Further complications confusing the link between indicator brightness and electrophysiology include indicator saturation, intrinsic and indicator buffering, and diffusion biophysics [95]. Increased calcium buffering due to high indicator concentrations has also resulted in pathology [72, 105]. Possibly the most important disadvantage of calcium indicators is their inability to report subthreshold membrane potential changes. Calcium indicators can be used to image calcium influx related to synaptic events [17]; however, at the soma, they report suprathreshold AP activity. These factors have motivated development of better optical, chemical, and biological techniques for imaging voltage in neurons. Both the relatively low ΔF/F0 of voltage indicators and the high RFl they require, however, necessitate careful consideration of the photon budget (F0) when developing imaging strategies.

In summary, voltage indicators feature small ΔF/F0 and fast kinetics (short τOff) that require imaging with high fluorescence budget, especially long Δt, modalities (e.g., widefield, light field) to maintain SNR. Calcium indicators feature comparatively high ΔF/F0, and calcium transients decay slowly, enabling imaging at low Racq. These features render calcium indicators eminently compatible with low fluorescence budget modalities such as multiphoton raster scanning.

#### 4.3 Spectral Considerations

Of particular concern for all-optical experiments is avoiding spectral crosstalk between photostimulation and fluorescence excitation light. Such crosstalk comes in two types: imaging and physiological. Imaging crosstalk occurs when photons meant to stimulate neurons spuriously excite the functional fluorescent indicator. Imaging crosstalk can be avoided altogether if the photostimulation wavelength does not efficiently excite the indicator (e.g., [40]). If not, fortunately, transient artefacts due to imaging crosstalk can be predicted or measured, and subtracted, even in real time in some configurations [35], and generally do not affect the scientific integrity of the data. Physiological crosstalk occurs when light intended to excite the functional indicator fluorophore is spuriously absorbed by the opsin, causing neuronal de- or hyperpolarization. In contrast with imaging crosstalk, physiological crosstalk, even subthreshold, undermines the scientific validity of the data by compromising the membrane potential, and in some cases action potential rate and timing, of the imaged neurons. Indeed, the photocurrents spuriously generated by the imaging light cannot be subtracted; they must be prevented to realize the scientific potential of all-optical neurophysiology.

There are two ways to prevent physiological crosstalk in all-optical experiments: spectral separation and spatial separation. Spectral separation refers to exciting the indicator fluorophore at wavelengths at which the opsin actuator cross-section is so small that no photocurrents can be measured by whole-cell patch clamp in the opsin-transfected cells. Importantly, this control must be measured at the intensities and durations needed for imaging the fluorescent indicator. For one-photon excitation, this has been achieved by pairing blue–green-absorbing opsins with red-shifted calcium dyes [104], voltage dyes [64, 114, 119], GEVIs [1, 34, 47], and GECIs [7, 124]. Alternatively, red-shifted opsins, such as C1V1 [125], ReaChR [65], or Chrimson [57], can be combined with green-emitting calcium indicators (see also Chapters 3 and 4), although these red-shifted opsins exhibit 20%–30% actuation efficiency under blue-light excitation [116] and are thus more susceptible to spectral crosstalk than pairings where the opsin is excited at shorter wavelengths than the indicator. Spatial separation refers to limiting [107] or eliminating the fluorescence excitation light incident on opsin-expressing cells and substructures. For instance, in cases where the opsin is targeted to the soma [10, 67, 99] or to a specific neuronal subpopulation, the fluorescence excitation light could be patterned exclusively over non-opsin-expressing structures and cells. Efforts to reduce crosstalk in one-photon excitation schemes with large spectral overlap between opsin actuators and indicators (e.g., the actuator channelrhodopsin-2 [ChR2] + GCaMP calcium reporters) have minimized read-out light intensities (⟨I<sup>n</sup>⟩) [44, 107] to the detriment of SNR.

Multiphoton schemes for opsin actuation and imaging have yet to demonstrate a configuration completely free of physiological crosstalk. Finding a scheme in which the imaging laser does not evoke spurious photocurrents, producing sub- and/or suprathreshold membrane potential changes, is especially difficult due to broad opsin two-photon action spectra. Spurious suprathreshold activation has been reduced or avoided by using opsins and indicators with partially separated absorption spectra (e.g., C1V1 with GCaMP6s [83]; ChR2, GtACR2, or stCoChR with jRCaMP1a [36, 37]) and by limiting the imaging dwell time, although subthreshold actuation may still occur. Broad multiphoton spectra can provide an advantage for imaging-only configurations (without an opsin) by exciting multiple fluorophores simultaneously with a single wavelength (e.g., green-emitting OGB-1 or GCaMP with red-emitting SR101 [16, 75]). Excitation to a higher-energy electronic excited state has also enabled multi-fluorophore three-photon imaging with a single wavelength [48].

#### 4.4 Fluorophore Spatial Distribution

Of critical importance to SNR is the fluorescent indicator spatial distribution, both within a cell and within a population of cells. The fluorescent indicator properties determine the maximum ΔF/F0, but this is effectively reduced in proportion to the amount of "useless" or "background" fluorescence in the cell and surrounding space. Background fluorescence reduces the SNR as the non-signal-containing photons are collected from the same ROI as signal-containing photons, reducing the fractional change in fluorescence. If the baseline fluorescence rate from signal-containing molecules is given by F0, and the rate of background fluorescence is given by FB, then the fractional change in fluorescence is reduced to

$$\frac{\Delta F}{F_0 + F_B} = (1 - f_B) \frac{\Delta F}{F_0}, \quad \text{where} \quad f_B = \frac{F_B}{F_0 + F_B}, \tag{8}$$

the fraction of fluorescence contributed by the background. The SNR is then given by

$$\text{SNR} = (1 - f_B) \frac{\Delta F}{F_0} \sqrt{F_0 + F_B} = \sqrt{1 - f_B}\, \frac{\Delta F}{F_0} \sqrt{F_0} = \sqrt{1 - f_B}\, \text{SNR}_0, \tag{9}$$

as F0 + FB = F0/(1 − fB) [58].
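As a rough numerical check of Eqs. 8 and 9, the following Python sketch computes the shot-noise-limited SNR with and without background fluorescence. The function names and the example photon counts are illustrative assumptions, not values taken from this chapter.

```python
import math

def background_fraction(f0, fb):
    """Fraction of collected fluorescence contributed by background (f_B in Eq. 8)."""
    return fb / (f0 + fb)

def snr_direct(dff, f0, fb):
    """First form of Eq. 9: diluted fractional change times sqrt of total counts."""
    return dff * f0 / (f0 + fb) * math.sqrt(f0 + fb)

def snr_from_fraction(dff, f0, f_b):
    """Last form of Eq. 9: sqrt(1 - f_B) times the background-free SNR."""
    return math.sqrt(1.0 - f_b) * dff * math.sqrt(f0)

# Hypothetical example: 10% transient, 1000 signal photons, 3000 background photons.
dff, f0, fb = 0.10, 1000.0, 3000.0
f_b = background_fraction(f0, fb)
penalty = snr_direct(dff, f0, fb) / (dff * math.sqrt(f0))
print(f"f_B = {f_b:.2f}, SNR penalty = {penalty:.2f}")
```

In this hypothetical case three quarters of the collected photons come from background, so the SNR is halved relative to the background-free case.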

Dense fluorophore labeling poses problems especially for voltage indicators due to their membrane localization. Voltage signals in densely labeled samples cannot be resolved without indicator somatic restriction to reduce fluorescence contributions from overlapping adjacent processes (e.g., [2, 4, 117]) or imaging at sub-micron resolution, which has yet to be demonstrated. This problem is mitigated in preparations where the labeled cells are non-adjacent or "sparse." For example, single neuron, single-trial action potential GEVI imaging has been achieved by expressing the GEVI strongly and sparsely in a subpopulation of cortical layer 2/3 excitatory neurons [90] due to the high effective ΔF/F0.

In summary, the sensitivity, brightness, kinetics, and spatial distribution of a fluorescent indicator determine which imaging modalities can resolve the functional fluorescence transients. Bright, slow, and sensitive indicators can boost temporal SNR for low fluorescence budget imaging modalities. For example, if interested in membrane potential, but wanting to track the activity of many cells with scattering-robust two-photon imaging, one can compensate for two-photon imaging's low fluorescence budget with a slow, bright calcium indicator.

#### 5 SNR and Imaging Modality

Fluorescence imaging systems comprise two subsystems: (1) the fluorescence excitation subsystem and (2) the fluorescence detection subsystem. Here we detail how the characteristics of each subsystem contribute to temporal SNR.

#### 5.1 Fluorescence Excitation

Regarding SNR, the fluorescence excitation subsystem can be characterized in terms of light intensity, ⟨I<sup>n</sup>⟩, and the per pixel, per integration period excited volume, VFl. Together with the fluorophore cross-section (σn), concentration (CFl), integration period (Δt), and spatial distribution, these parameters determine the maximum available fluorescence excitation budget (Fgen; Eqs. 3, 4). The illumination wavelength and fluorophore cross-sections determine whether the system favors one-photon, two-photon, or three-photon fluorescence excitation. Two- and three-photon excitation requires ultrafast pulsed lasers with megahertz<sup>2</sup> repetition frequencies to achieve RFl sufficient for imaging. One-photon fluorescence excitation rates, RFl, vary widely depending on indicator brightness (σ1) and illumination intensity (⟨I⟩) and generally exceed RFl for two- and three-photon modalities, which typically excite < 0.1 photons per laser pulse [101].

#### 5.1.1 Fluorescence Excitation Volume, VFl

The fluorescence excitation rate critically depends on the degree to which the fluorescence excitation is parallelized. Widefield excitation, which excites fluorescence in all locations throughout a volume simultaneously, features the highest degree of parallelization, thus maximizing the fluorescence excitation budget. Focusing a laser beam to a diffraction-limited point and serially scanning that point corresponds to the lowest parallelization and excitation budget. In between widefield and scanned point excitation, the excitation light can be sculpted into many forms, including a large point (scanned temporal focusing; S-TeFo, [87, 118]), multiple scanned (spinning disk confocal [108, 126], multifocal 2P [15, 55, 78, 89, 98, 120, 127]) or static (computer-generated holography, [23, 32, 85, 122]) points, a line (TeFo line scanning [28], SLAP [53], vTwINS [103], Bessel beams [19, 66]), whole planes [5, 20, 51, 54, 82, 96, 97, 128], and extended shapes patterned directly onto structures of interest [21, 39, 79, 109, 110]. It is important to note that not all fluorescence photons contribute useful signal. For example, a higher proportion of photons excited through two-photon point scanning contribute to image formation compared to widefield imaging, where photons excited outside the plane of focus are not imaged and can smear the temporal signals extracted from in-focus ROIs. It is also important to note that VFl introduced earlier in this chapter refers to the per location fluorescence volume, not the total spot, line, or sheet volume, and therefore depends on the spatial discretization performed by the collection subsystem.

<sup>2</sup> Imaging a 128 × 128 pixel FOV at 10 Hz with one pulse per pixel, the lower limit, requires a 10 × 128<sup>2</sup> = 0.16 MHz repetition rate.



#### 5.1.2 Fluorescence Integration Time, Δt

The fluorescence excitation subsystem determines the relationship between Racq and Δt. In particular, for excitation that does not move or change shape during the acquisition period, Δt = 1/Racq. Scanning generally reduces Δt in proportion to the number of locations scanned. Table 2 summarizes Δt for the different scanned excitation shapes. Bearing in mind that SNR ∝ √Δt, we appreciate the power of parallelization to boost the fluorescence budget (F0), enabling imaging of smaller and faster signals, and/or over larger fields of view. Fluorescence parallelization, however, reduces robustness to scattering, as discussed next.
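The relationship between acquisition rate, number of scanned locations, and integration time can be illustrated with a minimal Python sketch. It assumes an idealized scan with no overhead (e.g., no flyback time) and, for the SNR comparison, an equal fluorescence excitation rate per location, which in practice differs strongly between modalities. The function names are hypothetical, and the example numbers follow the 128 × 128 pixel, 10 Hz case from the footnote above.

```python
import math

def integration_time(r_acq, n_locations=1):
    """Per-location integration time (s) when n_locations are visited serially per frame."""
    return 1.0 / (r_acq * n_locations)

def min_repetition_rate(r_acq, n_pixels, pulses_per_pixel=1):
    """Lowest laser repetition rate (Hz) that delivers the requested pulses to every pixel."""
    return r_acq * n_pixels * pulses_per_pixel

def relative_snr(dt, dt_reference):
    """SNR relative to a reference, assuming equal R_Fl per location so SNR scales as sqrt(dt)."""
    return math.sqrt(dt / dt_reference)

r_acq = 10.0            # frames per second
n_pixels = 128 * 128    # serially scanned locations
dt_widefield = integration_time(r_acq)             # static illumination: 1 / R_acq
dt_point_scan = integration_time(r_acq, n_pixels)  # serial point scanning
print(min_repetition_rate(r_acq, n_pixels))        # about 0.16 MHz
print(relative_snr(dt_point_scan, dt_widefield))   # about 1/128 of the widefield value
```

Under these simplifying assumptions, serially scanning 128 × 128 locations shortens Δt by a factor of 16,384 and therefore lowers the shot-noise-limited SNR by a factor of 128 relative to static illumination at the same Racq.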

#### 5.2 Fluorescence Detection

The fluorescence detection system determines how the fluorescence excitation budget is exploited to form images. Sensors for fluorescence detection fall into two categories: single and multichannel.

#### 5.2.1 Single-Channel Detectors

Single-channel detectors, including photodiodes and photomultiplier tubes (PMTs), read out fluorescence intensity (or photons) as a function of time. Imaging is achieved by combining single-channel detectors with scanned fluorescence excitation through the process of "temporal multiplexing": the localization of fluorescence based on when it was detected. Moreover, temporal multiplexing can be used to scan multiple areas or z-planes by alternating the focus of time-sequential laser pulses [12, 24, 26, 49, 106, 118]. The degree of temporal multiplexing, along with the total rate of fluorescence excited from the sample (RFl), is ultimately determined by the indicator's fluorescence lifetime [26].

Single-channel detection of point-scanned two- and three-photon fluorescence excitation features the highest achievable robustness to scattering and finest optical sectioning, owing to temporal multiplexing. These advantages, as previously discussed, are achieved at the cost of fluorescence excitation bandwidth, even when fluorescence rates (RFl) are maximized through pulse energies and repetition rates increased to the maximum allowable by photo-damage thresholds and fluorophore lifetimes, due to Δt's dependence on the number of voxels (NX × NY × NZ rastered, or Nlocs randomly accessed).

Intermediate techniques excite fluorescence from extended regions while collecting fluorescence on a single-channel detector and postprocess the signal to recover a 2- or 3-dimensional image. Notable examples include "Scanned Line Angular Projection" (SLAP) [53], Bessel beam scanning [66], "volumetric two-photon imaging of neurons using stereoscopy" (vTwINS) [103], and multiplane imaging [123]. These can considerably increase F0 for the same Δt compared to traditional single-point scanning while remaining robust to scattering. This comes with the caveat that the often complex and computationally expensive reconstruction techniques typically require the imaged sample or activity to be sparse.

With single-channel detection, the fluorescence excitation volume, VFl, is equal to the total volume excited by the spot, line, or sheet. Assuming that RFl remains constant, SNR increases in proportion to √VFl. Hence, scanning with a large spot [87] increases fluorescence excitation budget at the cost of spatial resolution, which is also determined by the fluorescence spatial profile or "point-spread function" (PSF). The ability to attribute fluorescence to individual neurons depends on the PSF and the sparsity of the fluorescent indicator labeling.

#### 5.2.2 Multi-channel Detectors

Multi-channel detectors for optical neurophysiology include one- or two-dimensional arrays of PMTs or photodiodes, and cameras, primarily charge-coupled device (CCD) and complementary metal oxide semiconductor (CMOS). Multi-channel devices enable "spatial multiplexing": the localization of fluorescence based on where it was detected on the array. With the notable exception of computational reconstructions based on structural image priors [53], imaging parallelized fluorescence excitation (multiple points, lines, sheets, widefield) requires spatially multiplexed detection, in most cases imaging a two-dimensional plane onto an array detector or camera. For volumetric imaging, the imaging plane can be scanned with the fluorescence excitation by moving the objective or sample, with electrically or acoustic gradient tunable lenses, or by remote focusing [8, 19]. Alternatively, the imaging depth of field can be extended to encompass the entire volume through, for example, wavefront coding [81, 92] or intentional spherical aberration [113]. Volumetric imaging can also be achieved with light field microscopy, which uses a microlens array to encode positional and angular information, enabling reconstruction of full volumes from a single two-dimensional frame [77]. Light field microscopy's high fluorescence budget has recently been exploited to image both neuronal calcium [43, 50, 80, 84, 86, 100] and membrane potential [6, 27, 91].

For multi-channel detection, VFl is equal to the volume of excited fluorescence "seen" by each pixel of the detector. Therefore, both imaging at the lowest feasible magnification and binning the fluorescence detected by pixels into regions of interest post hoc benefit fluorescence budget and SNR at the cost of spatial resolution.
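Post hoc binning of camera pixels into a region of interest can be sketched in a few lines of Python. This is a minimal illustration on a hypothetical 2 × 2 pixel movie stored as nested lists; the function names and data layout are assumptions made for the example, and real pipelines would typically operate on array data with motion correction and background handling.

```python
def roi_trace(frames, roi_pixels):
    """Sum photon counts over the ROI pixels in each frame (post hoc binning)."""
    return [sum(frame[r][c] for r, c in roi_pixels) for frame in frames]

def delta_f_over_f(trace, baseline_frames):
    """Normalize a trace to the mean of the chosen baseline frames."""
    f0 = sum(trace[i] for i in baseline_frames) / len(baseline_frames)
    return [(f - f0) / f0 for f in trace]

# Hypothetical 2 x 2 pixel movie: three baseline frames, then one frame 10% brighter.
frames = [
    [[100, 100], [100, 100]],
    [[100, 100], [100, 100]],
    [[100, 100], [100, 100]],
    [[110, 110], [110, 110]],
]
roi = [(0, 0), (0, 1), (1, 0), (1, 1)]
trace = roi_trace(frames, roi)                         # summed counts per frame
dff = delta_f_over_f(trace, baseline_frames=[0, 1, 2])
print(trace, dff[-1])
```

Binning the four pixels leaves ΔF/F0 unchanged but quadruples F0, so under the shot-noise limit the SNR would be expected to double, while the four pixels can no longer be distinguished from one another.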

All strategies that combine fluorescence parallelization with multi-channel detection feature contrast that decreases quickly with depth in scattering brain, because with multiple detectors scattered photons can no longer be localized with certainty to a single location. Thus increasing the photon budget by parallelizing collection reduces the depth in scattering tissue at which functional fluorescence signals can be imaged.

#### 6 Summary of Key Points

The choices of fluorescent indicator and imaging strategy determine SNR through their respective contributions to the fluorescence budget, summarized in Fig. 5. Scientific priorities ultimately drive whether the experiment is designed around an "ideal indicator" or "ideal imaging modality," which then constrains the choice of the other to achieve sufficient SNR at the minimum required acquisition rate, Racq. Figure 5 describes three example modality/indicator combinations and situates them with respect to the relative contributions of each to SNR. For example, two-photon mesoscopy serially scans many locations and hence features low Δt, which is compensated through GCaMP6s's high sensitivity (ΔF/F0), brightness (σ2), and long τOff, which accommodates low Racq [102]. In contrast, widefield imaging of fast voltage indicators, such as Di-4-ANEPPS analogs [9], relies on widefield's large fluorescence budget to resolve membrane potential changes at kilohertz frame rates.

While this chapter has focused on SNR for shot noise-limited imaging strategies, it is important to bear in mind other noise sources. Importantly, instrument noise can dominate in low light or low ΔF/F0 regimes. In vivo, noise arising from the sample, including respiratory, cardiac, and other motion, can dominate [33]. However, physiological noise occurs in distinct frequency bands that can often be compensated or subtracted [38].

The problem of physiological crosstalk, in which imaging light spuriously actuates changes in membrane potential due to broad opsin action spectra, has not been fully addressed for multiphoton excitation. Physiological crosstalk could be completely avoided, in principle, by restricting fluorescence to structures or cell populations that are not illuminated by the imaging laser.

A key take-home message is that indicators and imaging modalities featuring the highest fluorescence budgets enable the highest acquisition rates over the largest number of locations. This concept is reviewed in detail for multiphoton modalities by [62]. However, high budget modalities also generally do the least to mitigate light scattering effects, limiting the depth at which functional fluorescence transients can be resolved. When comparing candidate imaging modalities for all-optical experiments, careful inspection, in particular of VFl and Δt, can enable reasonable prediction of how SNR would compare to that of alternative strategies. SNR is the most important figure of merit to consider when designing an optical physiology imaging strategy, as it encompasses a variety of variables and ultimately determines whether or not the experiment will be able to detect the biophysical phenomenon of interest.

Fig. 5 Balancing indicator and imaging strategy contributions to SNR. The imaging modality (upper triangle) and fluorescent indicator (lower triangle) properties together determine fluorescence transient SNR (horizontal axis; equation, top). Importantly, the SNR of a "low fluorescence budget" imaging modality can be compensated by a high ΔF/F0 or "high fluorescence budget" indicator and vice versa. The vertical dashed lines situate three example indicator/modality pairings with respect to each's relative contribution to SNR

#### 7 Notes

#### 1. Defining SNR: Variation Across Disciplines

Although a ubiquitous concept in many scientific and engineering fields, the exact definition of SNR varies between fields. In functional neuroscientific imaging particularly, SNR is often defined in a way at odds with what is common in the signal processing world. SNR is commonly understood as the ratio of the level of the signal of interest to the level of the noise in the measurement. Precisely defining what we mean by level, however, and how to report the ratio, is where different fields and different studies within fields start to vary.

The canonical signal processing definition of the SNR is given by [115]

$$\text{SNR}_{\text{SP}} = \frac{P_S}{P_N}, \quad \text{where} \quad P_x = \frac{1}{N} \sum_{i=0}^{N-1} |x_i|^2, \tag{10}$$

where PS is the signal power, PN is the noise power, and Px defines the power of a discrete signal of length N, xi. Calculating this SNR for a functional imaging trace requires measuring or estimating a noise-free signal and the signal-less noise. This, however, can often be difficult or near impossible for many common functional imaging paradigms when there is no simultaneous electrophysiology. The functional imaging community therefore often reports a different SNR measure (sometimes called peak SNR or PSNR, not to be confused with the PSNR measure used in image processing [111]), defined as

$$\text{SNR}_{\text{N}} = \frac{S}{\sigma}, \quad \text{where} \quad S = \frac{F - F_0}{F_0}, \quad \text{and} \quad \sigma^2 = \text{Var}(F_0). \tag{11}$$

S is the amplitude of the fluorescence change during the signal of interest, such as an AP, and σ is the estimated RMS noise from a section of the time course without any signal, approximately equal to √PN from Eq. 10. This approach is straightforward for most neuroscience signals, as the activity is often temporally sparse, facilitating selection of time course sections with and without activity. This SNR is commonly reported as a simple ratio, whereas SNRSP is often reported in dB.

Fig. 6 Variation in SNR due to calculation method

These differences can result in different SNR values for the same traces, as Fig. 6 demonstrates. First, increasing indicator sensitivity scales SNRSP quadratically but SNRN only linearly, because SNRN uses signal amplitude rather than power. Second, and more importantly, SNRSP captures information about signal duration, which SNRN ignores. For signals that are temporally sparse, SNRSP can seem surprisingly low compared to SNRN, as the noise is spread throughout the whole trace while the signal is concentrated into short periods.
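The contrast between the two definitions can be reproduced with a short Python sketch on a synthetic, normalized trace. The trace length, transient shape, amplitude, and Gaussian noise level are arbitrary assumptions chosen for illustration, and the helper names are hypothetical. The noise-free signal and the signal-free noise are generated separately, which is possible in simulation but rarely in experiments.

```python
import math
import random
import statistics

def power(x):
    """Mean squared value of a discrete signal (P_x in Eq. 10)."""
    return sum(v * v for v in x) / len(x)

def snr_sp(signal, noise):
    """Signal-processing SNR: ratio of signal power to noise power."""
    return power(signal) / power(noise)

def snr_n(amplitude, noise):
    """Amplitude-based SNR (Eq. 11): peak fractional change over RMS noise."""
    return amplitude / statistics.pstdev(noise)

def to_db(power_ratio):
    """Express a power ratio in decibels."""
    return 10.0 * math.log10(power_ratio)

rng = random.Random(0)
n_samples, n_active, amplitude, sigma = 1000, 5, 0.10, 0.02
noise = [rng.gauss(0.0, sigma) for _ in range(n_samples)]
# Noise-free normalized signal: a brief rectangular transient in an otherwise silent trace.
signal = [amplitude if i < n_active else 0.0 for i in range(n_samples)]

print(f"SNR_N  = {snr_n(amplitude, noise):.1f}")
print(f"SNR_SP = {snr_sp(signal, noise):.3f} ({to_db(snr_sp(signal, noise)):.1f} dB)")
```

Because the transient occupies only 5 of 1000 samples, its average power is small and SNRSP falls well below SNRN for the same trace. Doubling the amplitude doubles SNRN but quadruples SNRSP.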

#### 2. The Fundamental Limit on SNR in Optical Imaging

The physical nature of photon detection limits the theoretical maximum SNR of functional optical imaging. We measure the fluorescence intensity, F, of an indicator to infer something about the underlying physiological process it reports. Poisson noise arising from the collection of fluorescence photons dictates how well we can do this for a given number of photons collected. The Poisson distribution gives the probability of detecting k photons in an interval when the mean rate is F0 as

$$P(F = k | F_0) = \frac{e^{-F_0} F_0^k}{k!}. \tag{12}$$

Both the expected value and the variance of F are equal to F0. Traces are commonly normalized to the mean intensity, F0, enabling easier comparison of structures with a different labeling brightness, and the variance of the normalized variable, F/F0, can be simply calculated due to the linearity of the expectation value as σ<sup>2</sup> = 1/F0. We assume here that the change in brightness is small compared to the baseline brightness, such that the noise during and outside the signal period is the same. Our normalized signal, S, is given by ΔF/F0, the change in fluorescence brightness, and so our signal-to-noise ratio is given by

$$\text{SNR} = \frac{S}{\sigma} = \frac{\Delta F}{F_0} \sqrt{F_0}. \tag{13}$$

Fig. 7 Demonstration of the effect of Poisson noise on SNR. Reproduced from [88], CC BY-SA 4.0. Inspired by [31]

The figure above illustrates this by drawing samples from a Poisson distribution to simulate fluorescent signals. The rate is increased by 10% in the central samples, and the baseline brightness is 10000 counts/sample in the top trace, and only 1000 counts/sample in the bottom. This leads to an SNR clearly increased by a factor of ≈ √10 ≈ 3 from the bottom to the top trace. On the right, a graph shows the theoretical maximum achievable SNR for a given brightness, for different relative changes in fluorescence from the signal of interest, ΔF/F0. This demonstration assumes our imaging system is Poisson noise-limited, which is typically true for bright fluorescent samples (Fig. 7).
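A simulation in the spirit of this demonstration can be sketched in Python using only the standard library. This is not the code used to produce the figure: the sample counts, the random seed, and the function names are assumptions, and the baseline and signal periods are simulated as separate lists rather than as one trace with a central step. Poisson counts are drawn by counting exponentially distributed arrival intervals within one unit of time, which is simple but slow for high rates.

```python
import math
import random
import statistics

def poisson_sample(mean, rng):
    """Draw one Poisson count by counting exponential arrivals within one unit of time."""
    count = 0
    t = rng.expovariate(mean)
    while t < 1.0:
        count += 1
        t += rng.expovariate(mean)
    return count

def simulate(f0, rel_change, n_base, n_signal, rng):
    """Return baseline and signal-period photon counts for one simulated trace."""
    baseline = [poisson_sample(f0, rng) for _ in range(n_base)]
    signal = [poisson_sample(f0 * (1.0 + rel_change), rng) for _ in range(n_signal)]
    return baseline, signal

def empirical_snr(baseline, signal):
    """Mean change during the signal period divided by the baseline standard deviation.

    Normalizing both terms by the baseline mean would cancel, so raw counts are used.
    """
    return (statistics.fmean(signal) - statistics.fmean(baseline)) / statistics.stdev(baseline)

def shot_noise_snr(rel_change, f0):
    """Theoretical Poisson-limited SNR from Eq. 13."""
    return rel_change * math.sqrt(f0)

rng = random.Random(1)
snr_dim = empirical_snr(*simulate(1000, 0.10, 200, 100, rng))
snr_bright = empirical_snr(*simulate(10000, 0.10, 200, 100, rng))
print(f"dim:    {snr_dim:.1f} (theory {shot_noise_snr(0.10, 1000):.1f})")
print(f"bright: {snr_bright:.1f} (theory {shot_noise_snr(0.10, 10000):.1f})")
```

With a finite number of samples the empirical estimates scatter around the theoretical values, but the tenfold brighter trace should show an SNR roughly √10 times higher.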

#### Acknowledgements

This work was supported by the Biotechnology and Biological Sciences Research Council (BB/R009007/1), the Royal Academy of Engineering under the RAEng Research Fellowships scheme (RF1415/14/26), and the Engineering and Physical Sciences Research Council (EP/L016737/1).

#### References


range population dynamics of anatomically defined neocortical networks. eLife 5:2221


Opt Exp 12(1):288. https://doi.org/10.1364/boe.403255


(2014) Independent optical excitation of distinct neural populations. Nature Methods 11(3):338


A (2017) Video rate volumetric Ca2+ imaging across cortex using seeded iterative demixing (SID) microscopy. Nature Methods 14(8):811–818


frontiersin.org/article/10.3389/fncel.2019.00039


K (2015) SPED light sheet microscopy: fast mapping of biological system structure and function. Cell 163(7):1796–1806


Wetzstein G, Deisseroth K (2015) Extended field-of-view and increased-signal 3d holographic illumination with time-division multiplexing. Optics Express 23(25): 32573–32581


Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

## Light-Based Neuronal Circuit Probing in Living Brains at High Resolution: Constraints and Layouts for Integrating Neuronal Activity Recording and Modulation in Three Dimensions

## Matteo Bruzzone, Enrico Chiarello, Andrea Maset, Aram Megighian, Claudia Lodovichi, and Marco dal Maschio

#### Abstract

Understanding how the brain orchestrates neuronal activity to finely produce and regulate behavior is an intriguing yet challenging task. In recent years, the progressive refinement of optical techniques and light-based molecular tools has made it possible to start addressing open questions in cellular and systems neuroscience with unprecedented resolution and specificity. Currently, all-optical experimental protocols that combine simultaneous recording of the activity of large cell populations with concurrent modulation of the firing rate at cellular resolution represent an invaluable tool. In this scenario, it is becoming ever more evident how important it is to sample and probe the circuit mechanisms not just in a single plane but across the entire volume containing the involved circuit components. Here, we focus on the design principles and the hardware architectures of all-optical approaches that allow neuronal dynamics to be studied at cellular resolution across a volume of the brain.

Key words Optogenetics, Computer-Generated Holography, Volumetric Neuronal Imaging, 3D Photostimulation

#### 1 Introduction

Light-based approaches have emerged as a powerful tool to investigate the circuit organization and the functional mechanisms underlying information processing in an intact brain [1]. This stems from the fact that light, compared with other investigation methods, allows recording of physiological variations, e.g., neuronal firing, membrane potential, and neurotransmitter release, with enhanced cellular specificity and high spatial resolution, ranging from extended circuits down to subcellular compartments. These optical techniques rely on engineered fluorescent reporters like GECIs (Genetically Encoded Calcium Indicators) [2] or GEVIs (Genetically Encoded Voltage Indicators) [3], whose absorption efficiency or fluorescence emission yield depends on calcium ion concentration or cell membrane potential. GCaMP is currently, by far, the most commonly used GECI in neuroscience. Its molecular structure comprises a green fluorescent protein (EGFP) bound, through an M13 fragment of myosin light-chain kinase, to Calmodulin (CaM) [4]. CaM is a calcium-binding protein that, with the increase of intracellular calcium concentration following the firing of an action potential, induces a conformational change to the overall configuration, resulting in an increase in the fluorescence intensity detected. In 2013, a family of ultrasensitive calcium reporters, GCaMP6, was introduced [5] and later improved with the GCaMP7 family [6]. These molecules are characterized by changes in fluorescence that can reach up to 1100% and decay times ranging from 950 to 250 milliseconds. Under appropriate signal-to-noise ratio (SNR) conditions, these molecules enable the detection of a single action potential (AP) with scores as high as 94%. Along with EGFP-derived sensors, molecular variants with emission spectra shifted toward the red region have been engineered [7, 8]. RCaMPs and R-GECOs (Red Genetically Encoded Calcium indicators for Optical imaging) are the main families of red reporters. They are obtained by substituting the fluorescent reporter with mRuby and mApple, respectively. Compared to green indicators, typically excited at 920–930 nm and emitting in the 490–560 nm interval, RCaMPs are optimally excited around 1100 nm and emit around 590 nm. These reporters, in combination with optical approaches, enable the recording of neuronal activity in the brain circuits of a living organism with cellular or even sub-cellular resolution.

Eirini Papagiakoumou (ed.), All-Optical Methods to Study Neuronal Function, Neuromethods, vol. 191, https://doi.org/10.1007/978-1-0716-2764-8\_3, © The Author(s) 2023

From the circuit and systems neuroscience perspective, reconstructing cellular activity offers a powerful tool to look into brain mechanisms. However, the data obtained provide mostly correlative information, i.e., the activation of a certain neuronal sub-population occurring with the modification of a sensory or behavioral parameter [9]. Indeed, in this case, the possibility of validating the putative circuit mechanism and confirming its predictions on the expected dynamics of the system is still lacking. Ideally, for this purpose, one would like to record the change in the system activity when boundary conditions or network properties are artificially controlled.

The recent development and constant improvement of light-activated modulators of the membrane potential offer, in this sense, an additional tool to intervene non-invasively and non-destructively in the circuits and probe the circuit mechanisms, i.e., to measure the circuit output under controlled conditions, at cellular resolution [10, 11]. Such tools can be genetically encoded to target selected neuronal populations with exogenous light-sensitive ion channels or pumps (termed "actuators" or "opsins") [12]. These all rely on retinal, a polyene chromophore typically found in animal mechanisms of light conversion. When retinal absorbs a photon, it isomerizes and triggers a series of conformational changes that culminate in the characteristic activity of these proteins, e.g., ion diffusion or transport across the cellular membrane. The physiological effect of light stimulation is a change in the concentration of different ions, ultimately resulting in the modulation of the polarization level of the cell membrane. Several opsins are used in neuroscience, and they are grouped into two main classes, depending on whether light promotes or suppresses action potential firing. On one side, one can find actuators with depolarizing effects, like ChR2, C1V1, ChrimsonR, and Chronos, typically used to drive the membrane potential of excitable cells above the threshold for action potential firing. In certain cases, the kinetics of the elicited photocurrent are sufficiently fast to allow control of neuronal firing with submillisecond precision and repeated firing up to 100 Hz. On the other side, there are light-gated molecules with hyperpolarizing effects, like eNpHR and GtACRs, whose light-based activation moves the membrane potential into a range that prevents action potential firing or strongly reduces its probability [13].
Depending on their structure and properties associated with the photocycle, these molecules present different photocurrent magnitudes, ion selectivity, photocurrent kinetics, and spectral sensitivity. All these parameters strongly impact either the kind of neuronal modulation achievable or the optimal method to obtain it in combination with concurrent activity recordings.

In this chapter, we describe the hardware components to probe 3D brain circuits at cellular resolution. We present currently reported light-based architectures for 3D imaging and 3D photostimulation based on multiphoton absorption. We highlight the most relevant aspects associated with the integration of these two components in the same experimental paradigm.

#### 2 Methods

#### 2.1 Molecular and Technical Constraints

In general, all-optical probing of brain circuits relies on the possibility of concurrently recording and modulating neuronal activity with high resolution [14]. This assumes the co-presence of light-gated actuators and light-based activity reporters within the same preparation and, in some cases, within the same excitation volume. Combining these two families of tools in the same experimental scenario therefore requires evaluating the biochemical and biophysical properties of these molecules along with the technical aspects and working constraints associated with their use.

#### 2.2 Absorption Characteristics and Its Impact

In the design of all-optical experimental paradigms, one aims at rendering the imaging and the photostimulation processes as independent of each other as possible: preventing the imaging process from driving the activation of the opsin, which would alter the state of the expressing cells, and taking care that opsin photostimulation does not compromise the expected functionality of the activity reporter or the integrity of the extracted signal (see also Chap. 2). While it is possible to control many of the experimental parameters to maintain such ideal working conditions, it is less straightforward to engineer the biophysical and chemical properties of the molecules in use, which ultimately represent the first working constraint. Even for molecule pairs that maximize orthogonality, overlap between the absorption spectra is frequently present, and it can result in significant "crosstalk" effects [15] (Fig. 1). Thus, the combination of an actuator with a reporter becomes

Fig. 1 Spectral properties of the most common actuator-reporter pairs used for all-optical approaches. In the upper part are reported the light-driven actuators. The range with a 2P action spectrum greater than 60% is shown, with a darkened mark indicating the wavelengths corresponding to the activation peak. On the right side, the corresponding τOFF is reported. This indicates the photocurrent decay time at the offset of the illumination. In the lower part, excitation and emission spectra are presented for the most common activity reporters. Black lines highlight the actuator-reporter pairs reported [16–19]

possible on condition that suitable imaging or photostimulation parameters are identified, e.g., wavelength, pixel dwell time, power density, and field of view, that minimize spurious activation of the actuator or contamination of the reporter signal [20–25]. The identification of a suitable molecular pair comes with the optimization of the optical parameters based on the light sources available on the market. In general, the space of tunable optical parameters is sufficiently large to allow the design of experimental protocols even with low orthogonality, where the extent of spectral overlap becomes considerable.

#### 2.3 Photocurrent Integration and Spurious Opsin Activation

In the design of an all-optical investigation protocol, one important aspect to consider is the mechanism of photocurrent integration at the cell membrane level and the impact that the imaging process per se could have on the network state [26]. This originates from the photocurrent generated by the simultaneous or quasi-simultaneous activation of multiple molecules, whose contributions add together in space and time. While spatiotemporal photocurrent integration is the key element for effective photostimulation, it represents a constraint for imaging when opsins absorb significantly at the wavelengths used for imaging. Indeed, upon saturating excitation, net charge transfer across the membrane is proportional to the channel conductance and the opsin de-activation time constant, usually called τOFF, i.e., the time required after the light offset for the evoked current to return to zero (Fig. 1). The value of τOFF can span almost two orders of magnitude (from 1 to 2 ms for the fastest opsin, Chronos, to several tens of milliseconds in the case of C1V1 and ReaChR) and can affect the extent of opsin activation due to the imaging process [27]. It is known that, all other parameters being equal, opsins with longer τOFF support temporal current integration more effectively, resulting in greater net charge transfer and potentially enhancing the spurious opsin activation associated with the imaging process. Along with the imaging light power density at the focal position, it is then critical to tune the other imaging acquisition parameters so that the total light dose is sustainable and the impact of the imaging process on the network state is negligible. Currently, this is achieved by sparsening the pixelation matrix of the images or, equivalently, extending the field of view, and by reducing the pixel dwell time or the line scan time.
On the other hand, this solution for limiting the spurious activation of the opsin affects the SNR of the signal reporting the neuronal activity. Finding optimal working conditions then becomes a matter of properly balancing the expression levels of the molecules and the experimental parameters.
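The dependence of net charge transfer on τOFF can be illustrated with a minimal numerical sketch. It assumes, as a simplification not made in the text, that the photocurrent decays mono-exponentially from its peak after light offset, so that the charge carried by the decay is the peak current multiplied by τOFF. The function name and the 100 pA peak current are hypothetical, and the two τOFF values are only indicative of the range quoted above.

```python
import math

def decay_charge_pC(peak_current_pA, tau_off_ms, window_ms=None):
    """Charge (pC) carried by a mono-exponential photocurrent decay.

    Assumption (illustrative only): I(t) = I_peak * exp(-t / tau_off)
    after light offset. Since pA * ms = fC, the result is divided by
    1000 to express it in pC.
    """
    if window_ms is None:
        # Integral from 0 to infinity: I_peak * tau_off
        charge_fC = peak_current_pA * tau_off_ms
    else:
        # Integral truncated at window_ms
        charge_fC = peak_current_pA * tau_off_ms * (
            1.0 - math.exp(-window_ms / tau_off_ms)
        )
    return charge_fC / 1000.0

# Hypothetical comparison at equal peak current
fast = decay_charge_pC(100.0, 2.0)    # fast opsin, tau_off = 2 ms
slow = decay_charge_pC(100.0, 50.0)   # slow opsin, tau_off = 50 ms
print(f"fast: {fast:.2f} pC, slow: {slow:.2f} pC, ratio: {slow / fast:.0f}x")
```

Under this assumption the charge ratio equals the ratio of the τOFF values, which is why slow opsins are expected to be more sensitive to repeated exposure during raster imaging.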

#### 2.4 Off-Target Activation and Somatic Opsin Targeting

The possibility of using all-optical approaches to probe brain circuits critically depends, along with recording the neuronal activity without perturbing the system, on the capability to target the neuronal modulation with high spatial precision and selectivity. This assumes that the photostimulation impacts exclusively or mostly on the identified targets, i.e., sets of neuronal somata. This goal is not straightforward to achieve, as dendrites and axons from many different cells surround the targeted cell body and constitute a dense mesh of neuronal processes that frequently express the opsin themselves. This raises a possible issue of off-target activation, i.e., indirect photostimulation of neuronal components other than those targeted. This issue lies beyond the reach of the hardware design and is independent of the specific optical implementation used to drive the light-based actuator. To overcome this limitation, in recent years many labs have developed light-driven actuators specifically targeted to the cell soma, substantially reducing the expression along the axons and the other processes [28–30]. This frequently results in more efficient stimulation (in terms of required power density) and in a strong reduction of the indirect effects mediated by passing neuronal processes.

#### 3 Hardware Implementations for 3D Recordings of Neuronal Activity

There are many hardware layouts reported for recording neuronal activity at different depths within the sample, either with sequential scanning of diffraction-limited spots in raster or random-access schemes or with longitudinally or laterally extended excitation profiles [1, 31–33] (see also Chap. 10). The ideal method depends on different factors, like the light scattering, the labeling sparseness of the sample, and ultimately the question addressed. Here we focus on the general architecture implementing 3D raster-scanning of diffraction-limited multiphoton excitation, as this is currently the method offering the highest flexibility [34]. Three main components generally characterize this scheme, in some cases partially overlapping (Fig. 2): a first one, starting with the source and including all the optical elements required for intensity modulation and for the spatial conditioning of the beam; a second one devoted to longitudinal scanning of the excitation spot along the light propagation direction (Z); and a third one, for the deflection of the excitation beam in the lateral direction (XY).

Ti:Sapphire-based sources are typically adopted for optical recordings of neuronal activity. Fluorescence emission from the activity reporter is excited deep in the brain tissue via multiphoton absorption of 180–250 fs light pulses in the IR range of the spectrum and delivered at the sample at a repetition rate of 40–100 MHz. The fluorescence emitted is detected with photomultiplier tubes and solid-state detectors [35], resulting in

Fig. 2 General hardware layout for 3D imaging. This typically includes three elements: the first one with the source and the intensity modulation unit, a component of the optical path designed for scanning the beam along the longitudinal direction (in blue), and a module for scanning the excitation beam along the lateral dimension (in red). The different elements are conjugated by means of relay optics

functional signals with a high SNR, as far as a sufficient photon density of the ballistic excitation component is preserved, and the scattered component does not contaminate the image background level. A Pockels cell is usually used as an intensity modulation unit to finely and rapidly control the imaging beam power at the sample. As the beam diameter is typically below 2 mm at this stage, optical elements for expanding the beam size are inserted, considering the target size at the level of the objective pupil and the magnification realized by the combination of the scan lens and the tube lens.

The most common design for functional recordings across a 3D brain volume is based on optical modules or elements enabling the imaging beam to scan the sample along the light propagation direction (longitudinal, Z). This has been traditionally implemented with the mechanical movement of the imaging objective by means of a piezoelectric actuator [36]. More recently, a series of approaches, mostly based on the control of the curvature of the light wavefront at the Back Focal Plane (BFP) of the objective, have been refined to remotely control the effective focal position while keeping the objective still [31]. To reconstruct the activity, the beam is deflected sequentially to different positions of the field of view. To achieve lateral scanning within the region of interest, imaging systems adopt a pair of galvanometric mirrors or acousto-optic deflectors.

#### 3.1 Remote Configurations for 3D Beam Scanning

In general, scanning a diffraction-limited beam in a volume of the sample relies on the possibility of modulating the light wavefront at the BFP of a lens or an objective. Indeed, superimposing a spatial gradient or a curvature onto the phase of the electric field of a planar light wavefront leads, upon propagation, to a lateral or a longitudinal offset, respectively, of the beam focus at the Front Focal Plane, FFP (Fig. 5). These two planes are said to be Fourier conjugated, and the light distribution at the FFP results from the diffraction of the wavefront modulated at the BFP. To easily impose such a modulation of the light wavefront, the BFP of the objective is optically conjugated using relay optics to a remote plane, where the element introducing the wavefront modulation can be more conveniently placed. This is, for instance, the design realized by the tube lens-scan lens pair and the galvanometric scanner. If more than one modulator has to be employed, e.g., to impose both a vertical and a horizontal tilt and a curvature, then either all the modulators are placed very close together near the focal plane of the scan lens, or they are distributed in multiple places and conjugated by additional relay optics to the BFP of the objective.
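The Fourier relationship between the BFP and the FFP can be checked numerically with a short sketch. It is a one-dimensional toy model under stated assumptions: the pupil is uniformly illuminated and sampled on a regular grid, and a discrete Fourier transform stands in for the lens. The function name and the sample counts are hypothetical, and positions are expressed in grid bins rather than physical units. Only the lateral case (linear phase gradient) is shown.

```python
import numpy as np

def focus_position(n_samples, ramp_cycles):
    """Index of the focal-plane intensity peak for a tilted pupil wavefront.

    ramp_cycles is the number of full 2*pi phase cycles applied linearly
    across the sampled pupil (a linear phase gradient).
    """
    x = np.arange(n_samples)
    # Uniform-amplitude pupil field with a linear phase ramp
    pupil = np.exp(2j * np.pi * ramp_cycles * x / n_samples)
    # The lens is modeled as a discrete Fourier transform of the pupil field
    focal_field = np.fft.fft(pupil)
    intensity = np.abs(focal_field) ** 2
    return int(np.argmax(intensity))

for cycles in (0, 5, 12):
    print(cycles, "phase cycles -> focus at bin", focus_position(256, cycles))
```

In this model the focus moves by one bin for each additional phase cycle across the pupil, which is the discrete counterpart of the lateral offset produced by a steeper phase gradient.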

#### 3.2 Methods for Scanning the Sample Along the Lateral Dimension

The most common way of scanning the sample along the lateral dimension relies on a pair of galvanometric mirrors conjugated to the BFP of the objective. These are 3–6 mm wide mirrors mounted with orthogonal axes, whose orientations can be tuned by proper control signals. A change in the orientation with respect to the propagation of the incoming beam introduces a linear phase gradient to the wavefront, resulting in the lateral offset of the beam with respect to the center of the field of view (FOV). In a raster scanning scheme, one mirror sweeps over a line, and the other steps from one line to the next. With a typical pixel dwell time of about 4 μs, this design results in a frame acquisition time of about one second for a mesh of 512 × 512 pixels. An 8–12 kHz resonant galvanometric mirror can replace the line scanning galvo, reducing the line acquisition time to 62–41 μs, respectively, and increasing the acquisition rates up to 30–45 frames per second for a pixel matrix of 512 × 512 elements. In this second scenario, there is no direct possibility of adjusting the pixel dwell time, and its extremely short value, typically 120–80 ns, heavily impacts the number of collected photons per pixel and, ultimately, the image SNR. In parallel to the solutions described above, it is possible to steer the light beam by modulating the light wavefront in an inertia-free approach using acousto-optic deflectors (AODs) [37, 38]. These are active optical elements with a crystal window bonded to a piezoelectric transducer. When the transducer is driven by an electrical signal at high frequency, it induces an acoustic wave traveling along the crystal in a particular direction. Through the photoelastic effect, this generates a diffraction grating, so that an optical beam propagating through the crystal in the direction orthogonal to the acoustic wave experiences an angular deviation that is proportional to the driving frequency.
A pair of consecutive and orthogonal AODs can be used as a scanner to tilt the light wavefront in both the lateral axes. AOD-based scanners are characterized by bandwidths between 30 and 40 kHz, corresponding to a typical beam resetting time between 15 and 25 μs that accounts for the time needed for the acoustic wave to cover the distance corresponding to the effective optical window. While in a random access scheme, these numbers allow for extremely high and unparalleled temporal resolution in the recordings, the acquisition rates in full-frame raster scanning mode can be comparable with those achieved with a resonant scanner, i.e., 30 Hz. Diffraction efficiency and group delay dispersion are two other important figures that a design with an AOD scanner should deal with to obtain good imaging performances [39].
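The raster-scanning figures quoted above follow from simple arithmetic, sketched here. The sketch assumes bidirectional acquisition for the resonant mirror (two lines per mirror period), ignores flyback and turnaround overheads, and reports a mean pixel dwell time even though the sinusoidal motion of a resonant mirror makes the true dwell time vary along the line. The function names are hypothetical.

```python
def galvo_frame_time_s(n_lines, pixels_per_line, dwell_time_s):
    """Frame time for a galvo raster scan, neglecting flyback time."""
    return n_lines * pixels_per_line * dwell_time_s

def resonant_scan_timing(resonant_freq_hz, n_lines, pixels_per_line,
                         bidirectional=True):
    """Line time (s), frame rate (Hz), and mean pixel dwell time (s)."""
    lines_per_period = 2 if bidirectional else 1
    line_time_s = 1.0 / (resonant_freq_hz * lines_per_period)
    frame_rate_hz = 1.0 / (line_time_s * n_lines)
    mean_dwell_s = line_time_s / pixels_per_line
    return line_time_s, frame_rate_hz, mean_dwell_s

print(f"galvo, 4 us dwell: {galvo_frame_time_s(512, 512, 4e-6):.2f} s per frame")
for freq_hz in (8e3, 12e3):
    line_s, fps, dwell_s = resonant_scan_timing(freq_hz, 512, 512)
    print(f"{freq_hz / 1e3:.0f} kHz: line {line_s * 1e6:.1f} us, "
          f"{fps:.1f} fps, dwell {dwell_s * 1e9:.0f} ns")
```

With these assumptions, an 8 kHz mirror gives a 62.5 μs line time and about 31 frames per second, and a 12 kHz mirror gives about 41.7 μs and 47 frames per second, in line with the ranges stated in the text.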

#### 3.3 Methods for Scanning the Sample Along the Longitudinal Dimension

Remotely adjusting the longitudinal position of an excitation spot generally relies on the possibility of imposing a curvature on the wavefront profile in a plane optically conjugated to the BFP of the objective. For imaging purposes alone, possibly the simplest implementation is to place a lens with a controllable focal length either at the level of the BFP, downstream of the XY scanner and the scan lens-tube lens pair, or in a plane optically conjugated to the BFP, upstream of the scanner. An electrically tunable lens (ETL) is an example of such a device. It is composed of a liquid volume enclosed between a glass window and an elastic polymer membrane, i.e., effectively a plano-convex liquid lens [40]. An electromagnetic coil driven by an electric current exerts pressure on the liquid, increasing the membrane curvature and thus the optical power of the lens. ETLs are capable of relatively fast settling times, typically between 5 and 15 ms, in response to a step-like control signal and can easily reach, depending on the objective properties, a focal range of a few hundred microns with limited distortion of the point spread function (PSF) (Fig. 3). Driving the ETL with either staircase-like or sawtooth control signals, in coordination with the frame acquisition, allows for the plane-after-plane acquisition of a volume. According to a similar scheme, a tunable acoustic gradient index of refraction lens (or TAG lens) can be used to quickly scan the sample along the longitudinal dimension [41, 42]. In this case, high-frequency driving of a piezoelectric actuator generates in a viscous medium a cylindrical acoustic wave whose constructive interference produces periodic changes in the medium density and, consequently, in the lens optical power. The resonance frequency of these devices, typically a few hundreds

Fig. 3 Multiphoton imaging across multiple planes in the olfactory bulb of a mouse expressing GCaMP6f. A basic configuration for multiphoton imaging is complemented with an electrically tunable lens to sample quasi-simultaneously different planes of the brain volume. Four different planes are shown with activity profiles extracted from the corresponding cells

of kHz, results in axial scanning times on the order of a few microseconds. In this case, volume information is acquired not plane-by-plane moving along the longitudinal axis but section-by-section aligned to it. A critical parameter to evaluate for the positioning of an ETL or a TAG lens is the effective aperture of the optical window with respect to the beam diameter at the designed position along the optical train.

Spatial light modulators (SLMs) based on a matrix of nematic liquid crystals represent a valid alternative for imposing a curvature profile [43–45]. These optical elements use the birefringent properties of the liquid crystal molecules to tune, with a control voltage, the effective refractive index of the individual cells within the matrix and control the phase delay of the different spatial components of a propagating wavefront [46–50]. These devices can encode relatively large wavefront curvatures by leveraging the large number of pixels and the possibility of phase folding, resembling the scheme of a Fresnel lens. For these systems, liquid crystal relaxation times pose the major restriction on the refresh rate of the diffractive optical element encoding the phase correction. Still, the refresh rate can reach 300–400 Hz, allowing resettling times close to 2.5–5 ms. Faster switching times can be obtained by moving to deformable or tunable mirrors (DMs) [51]. Indeed, these latter devices can currently achieve large maximal phase strokes with sub-millisecond resettling times. As SLMs and DMs operate more conveniently in reflection mode, the preferred position of these remote refocusing devices is upstream of the XY scanner, where more space is also available for beam conditioning according to the optical window and the working mode of these devices. With respect to an ETL or a TAG lens, an advantage of SLMs and DMs is that, along with the wavefront correction required for the defocus, it becomes possible to superimpose additional phase corrections on the wavefront to compensate for aberrations induced by other optical elements, including the objective and the sample [44].
If a faster longitudinal scanning speed is required, a variable curvature can be added to the wavefront with an acousto-optic lens, which is implemented with a combination of the same acousto-optic deflectors (AODs) used for lateral scanning, as described previously [39, 52]. When the acousto-optic deflector is driven with a frequency ramp, the acoustic wave produces a gradient of local spatial frequency that deviates the wavefront by different amounts at different positions, resulting in a focusing effect, with a focal length inversely proportional to the rate at which the frequency changes. Since the frequency changes over time, the position of the focus is not steady but drifts laterally with a constant speed. Employing a pair of consecutive AODs, driven with the same frequency ramp but with acoustic waves traveling in opposite directions, results in two drifts that cancel each other out. Another way of longitudinally shifting the excitation spot is by conjugating the focal plane of the imaging objective with the focal plane of another "remote" objective, where a fast translating mirror is placed in the FFP [53–55]. The beam injected into this z-focusing arm enters the remote objective and is focused to a spot at its (fixed) focal plane. When it is then reflected back by the mirror, it behaves as a point source, which is then focused by the other objective at the sample, where an image of it is formed. When the BFP of the remote objective is conjugated to the XY scanner and to the BFP of the main objective, longitudinally shifting the remote mirror shifts the actual position of the point source and consequently shifts the position of its image at the sample plane. This configuration, leveraging the lightweight mirror and the availability of accurate piezo actuators, can achieve settling times on the order of 1 ms and travel ranges of several hundreds of micrometers.
To further extend the bandwidth limits of the scanning process, solutions based on temporal multiplexing have been proposed to sample non-simultaneously points at different z-planes without actually changing the setting of any optical components [56]. This approach relies on the fact that the fluorescence decay time of the reporter, typically 2–4 ns, is shorter than the interpulse interval of the excitation sources, usually 10–12 ns. In these conditions, an optical module can be designed to split the original beam into n components, which are temporally delayed one from the other by an amount of time covering the fluorescence decay (0, τfl, 2 ∙ τfl, …, n ∙ τfl, with n ∙ τfl not exceeding the interpulse interval) and are corrected with different degrees of wavefront divergence. Upon recombination of the components, this results in a compound beam encoding a set of n different curvatures of the beam wavefront, each addressing a different z-plane in a different time window corresponding to the delay introduced. Temporal demultiplexing of the fluorescence signal with fast acquisition electronics according to the same interleaving scheme allows the neuronal activity from different planes to be reconstructed with minimal interplane interference.
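The timing budget of temporal multiplexing and the corresponding demultiplexing rule can be sketched as follows. The sketch assumes equal-length time windows within each interpulse interval and photon arrival times referenced to the first beamlet pulse. The function names, the 80 MHz repetition rate, and the 3 ns decay window are illustrative choices, not values prescribed by the text, and real systems must also account for detector and electronics response times.

```python
def max_multiplexed_beams(rep_rate_hz, decay_window_ns):
    """Largest number of delayed beamlets that fit in one interpulse interval."""
    interpulse_ns = 1e9 / rep_rate_hz
    return int(interpulse_ns // decay_window_ns)

def assign_plane(arrival_time_ns, rep_rate_hz, n_beams):
    """Map a photon arrival time to the index of the plane it is attributed to.

    Assumption: the interpulse interval is divided into n_beams equal
    windows, one per delayed beamlet.
    """
    interpulse_ns = 1e9 / rep_rate_hz
    phase_ns = arrival_time_ns % interpulse_ns
    window_ns = interpulse_ns / n_beams
    return min(int(phase_ns // window_ns), n_beams - 1)

n_beams = max_multiplexed_beams(80e6, 3.0)
print("beamlets per interpulse interval:", n_beams)
for t_ns in (1.0, 7.0, 16.5):
    print(f"photon at {t_ns} ns -> plane {assign_plane(t_ns, 80e6, n_beams)}")
```

Photons emitted late in the fluorescence decay can fall into the following window, which is one source of the residual interplane interference mentioned above.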

#### 4 Hardware Implementations for 3D Modulation of Neuronal Activity

The general architecture of an optical train for high-resolution 3D photostimulation is designed around the principle of obtaining sufficient integration of the light-induced charge transfer across the membrane. This results in layouts formed by two components: a first one, starting with the source and including all the optical elements required for beam intensity modulation and for the spatial conditioning of the beam; a second one, downstream of the first, integrating optical components and systems for the generation of arbitrary distributions of the electric field at the sample (Fig. 4). The scheme for the first part envisages the use of pulsed, high-peak-energy laser sources to activate light-gated actuators via a multiphoton absorption process. This is instrumental to access components of the neuronal circuit lying in the deep layers of a scattering preparation and to confine the spatial profile of excitation along the light propagation direction, taking advantage of the non-linear dependence of the multiphoton absorption on the excitation power. Along with traditional Ti:Sapphire lasers, sources based on oscillators and amplifiers with high pulse energy, up to the 100 μJ range, and low repetition rate, 500 kHz–5 MHz, have more recently been preferred for photostimulation [26, 57, 58]. Indeed, for the same average power at the sample, the probability of multiphoton absorption scales with the inverse of the laser emission duty cycle, so that the increased photon density per pulse allows a more efficient activation of the molecules at saturation and the targeting of larger populations of cells. The basic optical train for the first part includes, along with a light shuttering module, a stage for the modulation of the source intensity.
This is typically based on passive optical components, like the combination of a half-wave plate with a Polarizing Beam Splitter (PBS), or on active elements based on electro-optic or acousto-optic mechanisms, such as a Pockels cell or Acousto-optic Modulators (AOMs), respectively. The final stage of this section includes the optical components for conditioning the laser beam to match the constraints of the optical elements that generate the intensity distribution in the following section. It usually includes a set of lenses in a Galilean telescope configuration to set the beam diameter and the beam divergence level, and a waveplate to rotate the direction of the light polarization according to the requirements of the downstream optical elements.
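The duty-cycle argument can be made concrete for the two-photon case with a short sketch. It uses the standard scaling of the time-averaged two-photon excitation with the square of the average power divided by the product of repetition rate and pulse duration, and assumes identical pulse shape, wavelength, and focusing for the two sources. The function names and the numbers (100 mW at the sample, 200 fs pulses, 80 MHz versus 1 MHz) are hypothetical, and saturation and photodamage are ignored.

```python
def duty_cycle(rep_rate_hz, pulse_width_s):
    """Fraction of time during which the laser emits."""
    return rep_rate_hz * pulse_width_s

def relative_2p_excitation(avg_power_w, rep_rate_hz, pulse_width_s):
    """Time-averaged two-photon excitation in arbitrary units.

    Scales as P_avg**2 / (repetition rate * pulse width).
    """
    return avg_power_w ** 2 / duty_cycle(rep_rate_hz, pulse_width_s)

def pulse_energy_j(avg_power_w, rep_rate_hz):
    """Energy per pulse for a given average power."""
    return avg_power_w / rep_rate_hz

oscillator = relative_2p_excitation(0.1, 80e6, 200e-15)
amplifier = relative_2p_excitation(0.1, 1e6, 200e-15)
print(f"gain at equal average power: {amplifier / oscillator:.0f}x")
print(f"pulse energy at 1 MHz: {pulse_energy_j(0.1, 1e6) * 1e9:.0f} nJ")
```

At equal average power and pulse duration, the gain equals the ratio of the repetition rates, which is why low-repetition-rate sources can address more cells within a fixed power budget.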

Fig. 4 Scheme representing some of the hardware configurations for photostimulation in two or three dimensions. Starting from the left side, we report the solutions for sequential photostimulation with a diffraction-limited spot or with spatially extended photostimulation patches (in gray). Then, the parallel configurations are indicated, relying on scanning of multiplexed beamlets (yellow) and on scanless simultaneous excitation of multiple patches without (green) and with temporal focusing (red)

#### 4.1 The General Features of a Photostimulation Train

The core components for 3D light-assisted modulation of neuronal circuits are optical configurations engineered to generate arbitrary light distributions in the sample space and are generally grouped into two main classes [26]: sequential and parallel excitation approaches (Fig. 4). In the first design, a single beam, either diffraction-limited or with an engineered PSF, quickly travels across the investigation volume to excite sequentially multiple regions of interest, corresponding for instance to a designated subset of neurons or to portions of the corresponding regions. In the second scenario, the wavefront of the original beam is engineered to produce in the sample arbitrary distributions of multiple beamlets, each exciting at the same time a different cell within the targeted subset. In the next sections, we describe currently reported layouts for both photostimulation approaches and the corresponding critical parameters.

#### 4.2 Sequential 3D Photostimulation

In sequential or single-spot cell-resolution photostimulation, excitation light is focused on the sample in a single spot at a time. This spot can range from a diffraction-limited excitation volume scanned sequentially over portions of the cell membrane to a single, extended illumination profile tailored around the typical size of the cell soma. Steering the beam along the lateral direction and re-directing the excitation spot is then instrumental either to reach sufficient photocurrent integration or to target multiple cells. This design can be implemented with a pair of orthogonal galvanometric mirrors or acousto-optic deflectors conjugated to the objective Back Focal Plane (BFP) by a scan lens plus tube lens pair [59–61]. Both these devices offer sufficient bandwidth to shift the excitation spot in a few microseconds within the same cell body along a spiral or raster trajectory and/or to jump from one cell to another within the identified subset. In principle, with the coordinated control of the light intensity, the power could be distributed in an arbitrary spatial pattern, and a group of cells could be stimulated quasi-simultaneously with cell-matched excitation power density. As the size of the excitation spot dictates the number of light-gated molecules recruited, the same parameter ultimately impacts the integration time Ti, i.e., the total amount of time that the excitation spot is addressed to a certain cell. Ti typically ranges within 1–15 milliseconds, and its optimization, which depends on the cell membrane time constant, the light power density, and the density of light-gated molecules, is critical to elicit the expected physiological effect. Extension of the sequential paradigms to three dimensions has only been partially reported so far.
Optically conjugating an upstream electrically tunable lens (ETL) to a galvanometric scanner makes it possible to realize a 3D point scanning manipulation arm with a typical commutation time of 6–15 milliseconds to shift the beam from one plane to another [62]. This figure could be substantially improved to a few tens of microseconds with a design relying on acousto-optic deflectors to shift the beam along the longitudinal direction, either as a standalone z-module or integrated into a 3D acousto-optic lens. 3D-2P-AOD systems with random access technology designed for functional imaging are currently becoming commercially available. These should be, in principle, capable of also supporting a sequential 3D photostimulation paradigm, but so far there is no related report.
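A spiral trajectory of the kind mentioned above can be generated with a few lines of code. This is a minimal sketch assuming an Archimedean spiral swept at constant angular velocity from the soma center outward; the function names, the 5 μm radius, the number of turns, and the 10 ms integration time are hypothetical values chosen within the ranges discussed in the text. Converting positions in micrometers to scanner command voltages depends on the specific setup and is not shown.

```python
import numpy as np

def spiral_scan(center_xy_um, radius_um, n_turns, n_points):
    """XY coordinates (um) of an outward Archimedean spiral over a soma."""
    t = np.linspace(0.0, 1.0, n_points)
    r = radius_um * t                      # radius grows linearly
    theta = 2.0 * np.pi * n_turns * t      # constant angular velocity
    x = center_xy_um[0] + r * np.cos(theta)
    y = center_xy_um[1] + r * np.sin(theta)
    return np.column_stack((x, y))

def point_dwell_us(integration_time_ms, n_points):
    """Time spent on each trajectory point for a given integration time Ti."""
    return 1000.0 * integration_time_ms / n_points

path = spiral_scan(center_xy_um=(120.0, 80.0), radius_um=5.0,
                   n_turns=4, n_points=500)
print(path.shape, "points,", point_dwell_us(10.0, 500), "us per point")
```

A constant angular velocity concentrates the light dose near the center of the spiral; a constant linear velocity along the path is a common alternative when a more uniform exposure of the membrane is desired.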

Moving from scanning a diffraction-limited excitation volume to scanning a spatially extended excitation spot is an effective strategy to enhance the photocurrent integration in the spatial domain and ultimately improve the photostimulation bandwidth. Extending the excitation volume is typically achieved using Gaussian beams with reduced or low Numerical Aperture (NA) [61]. This solution comes with an elongation of the excitation profile along the light propagation direction that scales with the effective lateral size of the excitation spot and can quickly exceed the typical cell diameter by a substantial margin. A way to limit the extension of the excitation spot along the light propagation direction independently of its lateral size is to integrate into the optical path an arm for Temporal Focusing upstream of the beam XY scanner [25, 63]. This technique relies on a diffractive element, typically a grating, to introduce a position-dependent delay into the diffracted spectral components of the incoming light pulse. This leads to the temporal stretching of the pulse envelope everywhere along the optical path except at the focal position, where the delayed components reconstitute the original intensity distribution as a superposition of different beamlets (see also Chaps. 1 and 9 of this volume). SLMs and ETLs, optically conjugated upstream of a galvanometric scanner, can be used to control the position of photostimulation in 3D, with typical repositioning times ranging from 3 to 15 milliseconds. In this case, the resulting modulation of the wavefront at the BFP of the objective would account for a first component associated with the lateral offset introduced at the level of the galvanometric mirrors and a second component for the axial offset introduced by the SLM/ETL.

#### 4.3 Parallel 3D Photostimulation

Parallel photostimulation approaches are based on the possibility of simultaneously generating a set of beamlets, each targeting a different position in the sample volume. Most of the current strategies for 3D light shaping rely on Computer Generated Holography (CGH, Fig. 5) [46–48, 64–66], a powerful technique to achieve patterned illumination at the sample plane through phase modulation of the laser beam in a plane conjugated to the BFP of the objective.

As described above for controlling the focal position using SLMs, in CGH it is possible to impose spatial maps of phase corrections on the light wavefront to render the desired distribution of the excitation. Basic phase modulation patterns imposed at the objective BFP, like linear phase gradients (x, y) and parabolic phase profiles (z), resemble the effects of a combination of prisms and lenses and allow the original beam to be split and directed to different x, y, z coordinates. Superimposing these individual correction maps in one single phase map, also called a Diffractive Optical Element (DOE), operates like an optical multiplexer that can encode on its own hundreds of points in a designed 3D geometry in the sample volume. DOEs can be superimposed to a propagating beam either

Fig. 5 Computer-Generated Holography. Engineering the light wavefront at the BFP allows for rendering arbitrary light intensity distributions at the sample. Introducing a phase correction resembling a Fresnel lens results in a change in the convergence properties of a propagating beam, moving the position of the focus longitudinally. Similarly, with a linear gradient of phase delay applied at the BFP, the position of the focus is moved laterally. Multiplexing different diffractive optical elements (DOEs) allows for rendering arbitrary light distribution at the sample

with static phase masks engraved in glass or quartz material or using SLMs to dynamically update the light distribution. One important aspect of CGH is its versatility. An optical path based on this approach can be used as a stand-alone photostimulation module (3D-CGH) [20, 67] or, in combination with other components, to provide multiplexing capabilities as one stage of more extended photostimulation optical trains (MTF-CGH [68], 3D-SHOT [63] and 3D-CGH spiral [69]). In the first scenario (3D-CGH), the diffractive optical elements encode not only the positions of the foci in the sample volume but also their actual lateral shape. Indeed, iterative algorithms based on Fourier transforms can compute phase maps to render a 3D distribution of two-dimensional illumination profiles tailored independently around the specific structures of interest in the sample. These excitation foci can range from ensembles of dots to stimulation patches of arbitrary geometry, or combinations of the two. One aspect associated with 3D-CGH is that, while the lateral (XY) dimension of the excitation spots, which supports the process of photocurrent integration, can be imposed simply by specifying the desired intensity mask, the longitudinal (Z) extension of the illumination profile, being dictated by the laws of diffraction, scales linearly with the lateral size of the rendered shape. Depending on the experimental conditions, in particular the characteristic cell size, the sparseness of the light-gated actuator expression, or its cellular localization, this feature can impact the effective resolution achievable in targeted neuromodulation. For neuronal modulation with 3D-CGH, it is therefore important to identify a trade-off between the acceptable spatial resolution and the photostimulation patch size needed to maximize the photocurrent integration area.
In order to relax these constraints and to extend the performance and efficiency of 2P-based photostimulation, a few approaches have been developed that combine the power of the 3D-CGH approach with other optical components. In one of the first approaches reported, the beam multiplexing capability of CGH is combined with spiral scanning performed by a downstream galvanometric scanner [24]. In this implementation, similar to the sequential approach described above, diffraction-limited spots are quickly scanned with the same coordinated trajectory over different cells. The wavefront at the objective BFP in this scheme comprises a fixed component, a DOE imposed with the SLM and encoding a distribution of foci centered on the cell bodies, plus a time-varying component encoded by the galvo pair to scan the spiral over the designated cells. As long as the power density is kept under control, this approach can offer the best longitudinal confinement of the photostimulation pattern, corresponding to the PSF extension of a diffraction-limited spot. From the implementation with 2D parallel excitation, this approach can be extended to 3D, taking advantage of the control of the third dimension allowed by CGH [59]. As an alternative to 3D-CGH spiral scanning, beam multiplexing supported by phase modulation has been combined with temporal focusing. Regardless of the particular implementation, the general scheme consists of a step for shaping the beam amplitude, followed by a path with a dispersive element for Temporal Focusing (TF) [63], and finally a stage for spatial multiplexing with multipoint CGH based on an SLM. Shaping of the beam amplitude can be obtained either directly with a low-NA Gaussian beam to get circular profiles [70] or using CGH, Generalized Phase Contrast (GPC), and other amplitude modulation methods for tailoring the illumination patterns with top-hat profiles [68] (see also Chap. 1).
Most frequently the beam shaping is used to generate a single intensity profile at the level of the dispersive element for TF, resulting upon CGH-based multiplexing in the rendering of a set of exact replicas of the original shape. However, it is possible, by properly tiling and aligning the optical window of the beam shaper and of the multiplexer, to generate groups of replicas of different shapes [67, 68].
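The prism-and-lens picture of multipoint CGH described in this section can be sketched numerically. The example below is an illustration under stated assumptions, not the algorithm used in any of the cited works: it uses a paraxial lens term, an arbitrary sign convention for defocus, and a plain superposition of the complex fields followed by taking the phase (no iterative refinement). All function names and parameters are hypothetical.

```python
import numpy as np


def pupil_grid(n_pix, pitch):
    """Centered pupil-plane coordinates (same length unit as pitch)."""
    u = (np.arange(n_pix) - n_pix // 2) * pitch
    return np.meshgrid(u, u, indexing="xy")


def prism_lens_phase(uu, vv, x, y, z, wavelength, focal):
    """Phase steering one focus to (x, y, z): linear ramp plus parabolic term.

    Paraxial approximation; the sign of the defocus term is an assumed convention.
    """
    prism = 2 * np.pi * (x * uu + y * vv) / (wavelength * focal)
    lens = np.pi * z * (uu ** 2 + vv ** 2) / (wavelength * focal ** 2)
    return prism + lens


def multiplexed_phase(uu, vv, targets, wavelength, focal):
    """Phase-only map obtained by summing the fields of all targets."""
    field = np.zeros(uu.shape, dtype=complex)
    for x, y, z in targets:
        field += np.exp(1j * prism_lens_phase(uu, vv, x, y, z, wavelength, focal))
    return np.angle(field)


# Hypothetical numbers, all lengths in micrometers.
uu, vv = pupil_grid(n_pix=256, pitch=20.0)
targets = [(10.0, 0.0, 0.0), (-15.0, 5.0, 20.0), (0.0, -8.0, -30.0)]
doe = multiplexed_phase(uu, vv, targets, wavelength=1.03, focal=9000.0)
print(doe.shape, float(doe.min()), float(doe.max()))
```

A simple superposition of this kind generally produces foci of unequal intensity; the iterative Fourier-transform algorithms mentioned above are typically used when uniformity or extended shapes are required.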

#### 5 Setting Up an All-Optical 3D Investigation System

Combining light-based recording and modulation of neuronal activity at high resolution is, in general, a delicate task. One should identify and validate the experimental approach depending on the molecular properties of the tools and the capabilities of the available techniques. Even if this is rarely the case, the definition of the optimal hardware configuration should ideally come toward the end of a more extended development pipeline. Identifying suitable light-based molecules is the first phase in this kind of pipeline. Indeed, one should first characterize the effective functionality and operating ranges of the light-based molecules, e.g., the molecules' action spectrum, the SNR of the signal, or the change in membrane potential induced as a function of the excitation power. Even though most of these characteristics can be found in the literature, they should be verified under the actual working conditions and preparations of the planned experiments. After this initial phase of characterizing the intrinsic properties of the approach (e.g., working parameters for photostimulation), one should ideally evaluate its compatibility with the typical working conditions of the second, concomitant approach, e.g., GCaMP6s imaging. In particular, it should be assessed whether the first approach perturbs the state and/or the functionality of the molecules and the optimal working conditions required for the second one. This is typically where the impact of molecular crosstalk can be characterized and the effective capability of the techniques estimated. Unsurprisingly, one will have to deal with a set of experimental tradeoffs, for instance, between the SNR of the recordings and the level of spurious activation, or between the light power density of the photostimulation and the acceptable level of signal contamination picked up in the activity recordings due to the photostimulation. 
In this challenging task, even if not strictly related to the hardware configurations or the molecular properties, the optimization of the data acquisition chains and the design of algorithms to filter and clean the recorded signals are tools available to potentially relax the working conditions, at least partially.

#### 5.1 The Hardware Integration

Different optical configurations can support an all-optical circuit probing framework. Integrating and coordinating two components requires identifying the appropriate hardware solutions and the proper software capabilities. From the point of view of hardware integration, once a valid reporter-actuator pair is identified, the photostimulation train and the imaging train integrating the 3D scanning can be considered as two mostly independent components, with different sources, beam diameters, intensity modulation units, and working wavelengths. The two components become dependent on each other, on the other hand, at the point where the two excitation beams are combined along a common segment of the optical path to finally reach the objective back aperture. How and where to combine the two optical trains are two important aspects to consider. Typically, the solutions adopted take advantage of either the polarization state of the two beams or their spectral separation. In the first scenario, the seeding beams, typically leaving the source with horizontal polarization, are conditioned and routed so as to arrive at the point of combination with two linear and orthogonal polarization states, one horizontally and one vertically aligned with respect to the respective propagation directions. At that point, a polarizing beam splitter, properly oriented to reflect one of the beams and transmit the other, acts as a beam combiner to launch them along the same path (see Note 1). Alternatively, when flexibility in the excitation wavelength is not required, a suitable dichroic mirror, either long-pass or short-pass, can serve as the beam-combining element. Working with a defined wavelength for the imaging beam and the photostimulation beam reduces the number of optical components required for conditioning the beams and thus minimizes undesired reflections.

In general, defining the beam-combining architecture is not just a matter of the optical component to use but also of identifying a convenient position along the optical train at which to join the two components. This should be evaluated in terms of the available space, the minimization of the distortion introduced in the light wavefront, and the constraints dictated by the focal distances of the corresponding optical paths. To illustrate possible integration layouts, it is convenient to consider the most general scenario of an imaging system with an XY scanning assembly based on a pair of galvanometric mirrors, followed by the scan lens–tube lens pair (Fig. 6). The most frequent design envisages that the imaging and the photostimulation paths run independently and merge downstream of the XY scanner. This, combined with a remote system for z-scanning the imaging spot, allows the two paths to be uncoupled: the imaging z-scanning assembly is positioned upstream of the XY scanner and the photostimulation beam is fed into the train downstream of it. An element for combining the imaging and photostimulation beams can be placed in three different positions: upstream of (1), in between (2), and downstream of (3) the scan lens–tube lens pair (Fig. 6). While it can be hard to precisely predict the impact of such an element on the propagation characteristics of the imaging beam, it is important to consider the degree of flexibility associated with these three solutions. Indeed, moving from solution (1) to (3), one gains progressively more freedom to conjugate the photostimulation modulation plane to the BFP. While with design (1) the photostimulation beam is ultimately conjugated to the BFP using the scan lens–tube lens pair as relay optics, in (3) one has full flexibility to design the relay to the BFP of the objective independently of the imaging path and to accommodate the requirements of the photostimulation train, regardless of its complexity.

Fig. 6 Layout of the integration scheme of the imaging and photostimulation arms. In cerulean is indicated the path of the imaging beam with the assembly to control the z-position upstream of the XY scanner. In red are shown the possible insertion points of the photostimulation path with respect to the optical components required for imaging

#### 5.2 Beam Coregistration Procedures

The first aspect to consider is how the three light beams involved in the two approaches (excitation of the light-gated actuator, excitation of the fluorescent reporter, and fluorescence emission by the reporter) reach or leave the sample, whether these beams travel through the same objective, and whether the objective is kept still during the acquisition. A single, stationary objective is the most common case [20, 23, 54, 69], but architectures relying on independent arms are also possible, one for photostimulation and fluorescence collection with a moving objective, the other for reporter excitation [62]. Keeping a single objective still requires remote optical components for 3D control of both excitation beams. Remote control is instrumental in compensating for subtle differences in the beam divergence characteristics of the different paths or for chromatic aberrations, and it facilitates the co-registration of the excitation beams in the sample volume and the synchronization of the control signals. Co-registration of the excitation beams is generally understood as the procedure that allows the calculation of the affine transformations mapping the reference system of the photostimulation into the reference system of the imaging. The goal is to obtain a precise transformation that converts the position information X, Y, and Z in the sample space into the corresponding commands for targeting the photostimulation beam and the imaging beam to that precise point. Typically, this relies on a detection arm equipped with a CMOS/CCD camera and is achieved by sequentially illuminating a series of points in a 3D lattice, first with the imaging beam and then with the photostimulation beam, while moving the objective to bring the current excitation point into focus at the camera plane. Alternatively, without using a camera but relying on the PMTs, it is possible to use photobleaching of a fluorescent sample to measure the positions of the points within the lattice (see Note 2).
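The calculation of an affine transformation from matched lattice points can be sketched with an ordinary least-squares fit. This is a minimal illustration, assuming that the same lattice points have already been localized in both reference systems (for example from camera images or bleached spots); the function names and the synthetic numbers are hypothetical, and a real calibration may need additional nonlinear terms.

```python
import numpy as np


def fit_affine_3d(src_points, dst_points):
    """Least-squares 3x4 affine [A | t] such that dst ~= A @ src + t."""
    src = np.asarray(src_points, dtype=float)
    dst = np.asarray(dst_points, dtype=float)
    homogeneous = np.hstack([src, np.ones((len(src), 1))])  # N x 4
    coeffs, *_ = np.linalg.lstsq(homogeneous, dst, rcond=None)  # 4 x 3
    return coeffs.T


def apply_affine_3d(matrix, points):
    """Map N x 3 points with a 3x4 affine matrix."""
    pts = np.asarray(points, dtype=float)
    return pts @ matrix[:, :3].T + matrix[:, 3]


# Synthetic example: a 3 x 3 x 3 lattice in photostimulation coordinates and
# its (assumed) measured positions in imaging coordinates, with small noise.
axis = np.array([-50.0, 0.0, 50.0])
lattice = np.array([[x, y, z] for x in axis for y in axis for z in axis])
assumed = np.array([[1.02, 0.01, 0.00, 4.0],
                    [-0.01, 0.99, 0.00, -2.5],
                    [0.00, 0.00, 1.08, 10.0]])
rng = np.random.default_rng(0)
measured = apply_affine_3d(assumed, lattice) + rng.normal(0.0, 0.2, lattice.shape)

estimated = fit_affine_3d(lattice, measured)
residual = np.linalg.norm(apply_affine_3d(estimated, lattice) - measured, axis=1)
print("mean residual:", float(residual.mean()))
```

Inspecting the per-point residuals is a simple way to check whether a single affine map is adequate across the whole lattice or whether the outer points deviate systematically.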

#### 5.3 Spatial Uniformity and Addressable Field of View

In many optical systems, it is common to observe a progressive degradation of the optical performance with increasing distance from the center of the field of view. This appears, for instance, as an increase in the dimensions of the PSF for the imaging path or a decrease in the effective light power density for the photostimulation approach. This usually originates from a non-uniform diffraction efficiency of the optical elements. Mapping such non-uniformities is required to identify the effective volume addressable with the imaging and the photostimulation beams under the constraints of the resolution, SNR requirements, and working conditions identified in the previous phases. This typically requires a procedure similar to the one used for beam co-registration but refined in order to extract, along with the XYZ positions of the points in the lattice space, the change in intensity, any deformation of the excitation profiles, and the presence of possible aberrations. It is important to note that, to a certain extent, it is possible to develop corrective strategies and partially compensate for the limitations of the optical components. This is commonly done for the diffraction efficiency of an SLM included in the optical arm for photostimulation. Because of the low-pass filtering effects originating from the SLM working principle [46], the efficiency curve measured at the sample shows a rapid decrease with increasing distance from the center of the optical system (see Note 3).
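One simple corrective strategy for a position-dependent diffraction efficiency is to weight the power requested for each target by the inverse of the efficiency measured at that location. The sketch below is an assumption-laden illustration of that idea, not the procedure of Note 3: the function name is hypothetical and the efficiency values are made up for the example.

```python
import numpy as np


def compensation_weights(efficiency):
    """Fraction of total power to request per target so delivered power is equal.

    efficiency: measured relative diffraction efficiency at each target (> 0).
    """
    eff = np.asarray(efficiency, dtype=float)
    if np.any(eff <= 0):
        raise ValueError("efficiency values must be positive")
    inverse = 1.0 / eff
    return inverse / inverse.sum()


# Hypothetical efficiencies: one target near the center, two toward the edge.
measured_efficiency = np.array([1.00, 0.70, 0.45])
weights = compensation_weights(measured_efficiency)
delivered = weights * measured_efficiency
print("weights:", weights)
print("delivered (relative):", delivered / delivered.max())
```

The compensation costs total power: targets in low-efficiency regions consume a larger share of the budget, which is one practical way the addressable volume ends up being bounded.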

#### 6 Notes


#### 7 Conclusions

In this chapter, we described the current state of the art of the hardware configurations allowing all-optical investigation of the neuronal circuits in vivo in three dimensions. This is a field where the development of molecular tools and the technological advancement continuously provide novel possibilities for designing and refining experimental approaches that can extend the perspective of investigating the neuronal dynamics in living organisms. Despite the availability of technical solutions, the community has only partially capitalized on these tools to explore brain mechanisms. This is indeed not just a matter of the hardware required, but poses a series of questions, and challenges, also from the point of view of the design of the experimental protocol, the analysis of the data, and the interpretation of the outcome.

#### Acknowledgments

The authors would like to acknowledge the support of the Department of Biomedical Sciences (SID\_DalMaschio2018) and the Padua Neuroscience Center (ReTurnPD) at the University of Padua, the support of EC Research Programs (VISGEN and FLAMMES). The authors would like to thank the colleagues providing comments and suggestions to the draft versions.

#### References


Nat Commun 9:4125. https://doi.org/10.1038/s41467-018-06511-8


imaging. Neuroscience. https://doi.org/10.1101/736124


Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

## Chapter 4

## High-Speed All-Optical Neural Interfaces with 3D Temporally Focused Holography

### Ian Antón Oldenburg, Hayley Anne Bounds, and Nicolas C. Pégard

#### Abstract

Understanding brain function requires technologies that can monitor and manipulate neural activity with cellular resolution and millisecond precision in three dimensions across large volumes. These technologies are best designed using interdisciplinary approaches combining optical techniques with reporters and modulators of neural activity. While advances can be made by separately improving optical resolution or opsin effectiveness, optimizing both systems together matches the strengths and constraints of different approaches to create a solution optimized for the needs of neuroscientists. To achieve this goal, we first developed a new multiphoton photoexcitation method, termed 3D-Scanless Holographic Optogenetics with Temporal focusing (3D-SHOT), that enables simultaneous photoactivation of arbitrary sets of neurons in 3D. Our technique uses point-cloud holography to place multiple copies of a temporally focused disc, matched to the dimensions of a neuron's cell body, anywhere within the operating volume of the microscope. However, since improved placement of light, on its own, is not sufficient to allow precise control of neural firing patterns, we also developed and tested optogenetic actuators ST-ChroME and ST-eGtACR1 that fully leverage the new experimental capabilities of 3D-SHOT. The synergy of fast opsins matched with our technology allows reliable, precisely timed control of evoked action potentials and enables on-demand read-write operations with unprecedented precision. In this chapter, we review the steps necessary to implement 3D-SHOT and provide a guide to selecting ideal opsins that will work with it. Such collaborative, interdisciplinary approaches will be essential to develop the experimental capabilities needed to gain causal insight into the fundamental principles of the neural code underlying perception and behavior.

Key words Optogenetics, Temporal focusing, 3D holography, ChroME, Brain-machine interfacing, Multiphoton, 3D-SHOT, Soma-targeted, Opsin

#### 1 Introduction

Although optogenetics has become a mainstay of neuroscience research, used to probe causal relationships between circuit activity and behavior [1–6], it is only recently that multiphoton optogenetic techniques have been used to modulate neural activity. Numerous technical advances in optics [7–10] and opsins [11–15] over the last decade have led to an increase in usage and adoption of multiphoton optogenetic strategies. Multiphoton optogenetics has been used to examine visual discrimination [3, 14], coding features of detection [16, 17], ensemble activity [18], cortical circuitry [19], and more [20, 21].

Eirini Papagiakoumou (ed.), All-Optical Methods to Study Neuronal Function, Neuromethods, vol. 191, https://doi.org/10.1007/978-1-0716-2764-8\_4, © The Author(s) 2023

In this chapter, we will briefly review the state of the field and introduce 3D-SHOT [7]. We will detail the steps required to build, align, calibrate, and validate 3D-SHOT as an add-on on the light path of a 2-photon microscope. We will then discuss how to assess and select ideal opsin and reporter combinations for use with a 3D-SHOT system.

Optogenetic techniques have been rapidly and widely adopted in neuroscience research because they enable precise and reversible external control of neural activity with high temporal precision by means of minimally invasive optical signals. However, the spatial precision is generally too poor to manipulate individual neurons because light does not propagate well through dense brain tissue. The vast majority of optogenetics studies primarily leverage genetic specificity rather than spatial control. However, since many neural computations and behaviors rely on populations of neurons that are genetically similar but spatially intermixed [18, 22–24], precise targeting of individual neurons with optical methods is necessary. Two-photon activation of opsins is an attractive approach for improving spatial resolution. The longer wavelengths used in two-photon excitation are less affected by optical scattering [25], which dramatically improves the axial resolution and the accessible depth of sculpted illumination patterns [26, 27]. Further, two-photon absorption is a nonlinear effect which further restricts opsin excitation to a narrow axial plane [8, 28].

Most biological studies using 2-photon optogenetics have used scanning-based approaches [3, 14, 18–21]. Similar in principle to two-photon imaging, a femtosecond-pulsed infrared laser beam is focused into a single diffraction-limited spot which is scanned in 2D or 3D, with galvo-mirrors [27], acousto-optic modulators [29, 30], spatial light modulators (SLM) [31], or microelectromechanical systems (MEMS) [32]. A raster [8, 13] or spiral [18, 28, 33] pattern is scanned across the soma to target neurons in 3D. These techniques are power efficient [34] but require opsins that are either extremely strong or slow to deactivate (ideally both), so that the photocurrent can accumulate as the spot is scanned across the neuron soma, usually at the expense of temporal resolution [11].
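As an illustration of the spiral scanning pattern mentioned above, the following sketch generates the lateral coordinates of an Archimedean spiral that starts at the soma center and ends at its edge. It is a minimal example under assumed parameters (soma radius, number of turns, number of samples); converting these coordinates to galvo command voltages depends on the specific microscope and is not covered.

```python
import numpy as np


def spiral_scan(radius_um, n_turns, n_samples):
    """Archimedean spiral from the center out to radius_um, as an N x 2 array."""
    t = np.linspace(0.0, 1.0, n_samples)
    r = radius_um * t
    theta = 2.0 * np.pi * n_turns * t
    return np.column_stack([r * np.cos(theta), r * np.sin(theta)])


# Hypothetical values: 6 um radius, 5 turns, 500 samples per spiral.
trajectory = spiral_scan(radius_um=6.0, n_turns=5, n_samples=500)
print(trajectory[0], trajectory[-1])
```

With uniform sampling in angle, the spot moves faster at larger radii, so the dwell time per unit area is not uniform; whether that matters depends on the opsin distribution and kinetics.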

To improve temporal precision, whole-cell illumination techniques that forgo scanning have been developed to simultaneously illuminate the entire cell soma with a larger spot and activate all the opsin at once. Some approaches achieve whole-cell activation with low-NA objectives [13, 20] at the expense of spatial resolution, but the preferred method relies on computer-generated holography (CGH) [35–37] with a spatial light modulator (SLM) to synthesize custom illumination patterns that are matched to the shape of individual neurons (see also Chaps. 3 and 11). Compared to scanning approaches, whole-cell activation with CGH enables faster responses to optogenetic stimulation, but requires higher peak powers. With traditional multiphoton CGH [38], and even point scanning methods [33], spatial resolution along the optical axis is determined by how rapidly the power density is attenuated as light propagates into and out of the targeted area. This often results in significant but undesired photoexcitation above and below the target. In practice, physiological spatial resolution is highly power-dependent, and single-neuron spatial resolution (e.g., axial FWHM ~30 μm) [9, 39] is generally impossible across large volumes, even with high numerical aperture (NA) objectives.

To eliminate the tradeoff between target dimensions in the (x, y) plane and axial (z) resolution, 3D-SHOT relies on temporal focusing (TF) [40, 41], where a diffraction grating decomposes femtosecond pulses into separate colors, such that the different wavelength components within the original pulse propagate along separate light paths. Each component of the decomposed pulse has a narrow spectral bandwidth and is therefore broadened in time, which dramatically reduces the peak intensity and prevents two-photon absorption. However, two-photon absorption is enabled again where the original pulse is retrieved by constructive interference of all the chromatically separated components, at conjugate images of the diffraction grating [40, 41]. TF restricts multiphoton absorption to a narrow (z) depth that depends on the grating's spatial frequency, not on the dimensions of the targeted area. TF has been successfully applied to selective two-photon tomographic fluorescence imaging [42, 43] and has been implemented with mechanical scanning [44] and with random-access volume sampling of functional fluorescence [45]. A detailed presentation of TF is available in Chap. 9.
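The link between spectral bandwidth and pulse duration that underlies temporal focusing can be made concrete with the time-bandwidth product of a transform-limited pulse. The sketch below assumes Gaussian pulses (time-bandwidth product of about 0.441) and uses a 1040 nm center wavelength and bandwidths chosen only for illustration; none of these numbers come from the chapter.

```python
SPEED_OF_LIGHT_M_S = 299_792_458.0
GAUSSIAN_TIME_BANDWIDTH_PRODUCT = 0.441  # FWHM duration x FWHM bandwidth (Hz)


def transform_limited_fwhm_fs(center_nm, bandwidth_nm):
    """Shortest FWHM duration (fs) supported by a Gaussian spectrum."""
    center_m = center_nm * 1e-9
    bandwidth_m = bandwidth_nm * 1e-9
    bandwidth_hz = SPEED_OF_LIGHT_M_S * bandwidth_m / center_m ** 2
    return GAUSSIAN_TIME_BANDWIDTH_PRODUCT / bandwidth_hz * 1e15


full_pulse = transform_limited_fwhm_fs(center_nm=1040.0, bandwidth_nm=10.0)
narrow_slice = transform_limited_fwhm_fs(center_nm=1040.0, bandwidth_nm=0.5)
print(f"full spectrum: {full_pulse:.0f} fs, narrow slice: {narrow_slice:.0f} fs")
print(f"stretch factor: {narrow_slice / full_pulse:.0f}x")
```

For a fixed pulse energy and illuminated area, the two-photon excitation per pulse scales roughly as the inverse of the pulse duration, so a stretch of this magnitude away from the focal plane strongly suppresses out-of-focus excitation.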

For two-photon photostimulation applications, TF can activate opsins over a wide area matching the neuron's shape in the focal plane, without compromising depth specificity [39, 46]. TF also mitigates scattering [47, 48] even through thick layers of brain tissue [49, 50] (see also Chap. 9). Although multiphoton CGH with TF can achieve wide-field photostimulation with cellular resolution and high temporal precision, most implementations only enable excitation within a single 2D plane [20, 39, 46]. Thus, neurons located above or below the focal plane are not addressable, whereas access to them is necessary for many experiments designed to interface with neural circuits in vivo, where relevant neurons may be located anywhere in the 3D volume of interest. Multi-level temporal focusing has been implemented with holograms tiled into clusters on the SLM surface, which can be individually defocused in space by applying digital lens patterns on a second SLM [51]. This strategy is limited in the number of distinct depth levels that can be used before in-plane resolution is degraded, constraining the neuronal population that is simultaneously addressable with optical stimuli.

To overcome this outstanding challenge, 3D-SHOT leverages the advantages of CGH to simultaneously address custom 3D locations on demand, and of TF for enhanced spatial resolution at the scale of an individual neuron soma. To make TF and 3D CGH compatible, 3D-SHOT forgoes the ability to create custom patterns. Instead, it replicates multiple identical copies of a temporally focused excitation pattern, matched to the dimensions of a neuron soma, at each target location. The result is a technology that is specifically tailored to optogenetic photostimulation applications and enables single-shot in vivo photoactivation of custom neuron ensembles distributed anywhere in the accessible volume, with single-neuron spatial resolution.

Understanding the causal relationship between neural activity and behavior is a fundamental goal of neuroscience. However, this causal inference requires manipulations that act at the scale of natural activity, i.e., writing temporally precise patterns of activity in many cells with single-cell specificity. While optical approaches such as spiral scanning or 3D-SHOT can confine light to the dimensions of a single cell, molecular actuators, also known as opsins, are required to convert light into neural activity. The properties of these opsins and how they interact with the stimulation system will determine how well one can drive precise trains of activity. The fast and potent opsin ChroME [11] was engineered alongside the development of 3D-SHOT to achieve this goal, but is only a single example of a class of opsins with appropriate speed and sensitivity properties that have yet to be discovered. As such, the fourth section of this chapter outlines the criteria necessary to select an opsin optimally paired to 3D-SHOT for the purpose of writing precise trains of activity into groups of neurons.

Fundamental to the goal of writing temporally specific patterns is to pair extremely fast opsins with "flash"-based optical approaches, i.e., those that simultaneously illuminate an entire cell such as 3D-SHOT. In order to drive precisely timed action potentials faithfully and at high rates, the underlying evoked photocurrents must be very strong and very fast – both rising and falling very rapidly. Simultaneous illumination systems ensure that all possible opsin molecules are activated at the same time, leading to the shortest response time, while opsins with both fast kinetics and high conductance ensure that action potentials are faithfully driven, without the risk of creating doublets. Indeed, opsins with slower decay kinetics tend to have higher overall conductance but are not capable of driving pyramidal cells at high rates and often have large jitter in action potential timing [34, 52, 53].
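The effect of decay kinetics on the ability to follow high stimulation rates can be illustrated with a deliberately simple toy model: if the photocurrent decays as a single exponential with time constant tau_off after each light pulse, the fraction still remaining when the next pulse arrives is exp(-T / tau_off), where T is the inter-pulse interval. This single-exponential model and the time constants below are assumptions for illustration only; they are not measured values for any specific opsin.

```python
import math


def residual_fraction(pulse_rate_hz, tau_off_ms):
    """Fraction of photocurrent left at the next pulse (single-exponential decay)."""
    interval_ms = 1000.0 / pulse_rate_hz
    return math.exp(-interval_ms / tau_off_ms)


# Hypothetical off time constants for a "fast" and a "slow" opsin.
for label, tau_off_ms in (("fast", 5.0), ("slow", 50.0)):
    for rate_hz in (10.0, 50.0):
        frac = residual_fraction(rate_hz, tau_off_ms)
        print(f"{label} opsin, {rate_hz:4.0f} Hz: residual = {frac:.3f}")
```

In this toy picture, a large residual means successive photocurrents summate rather than returning to baseline, which is consistent with the loss of timing precision and the risk of extra spikes described above for slower opsins.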

Finally, multiphoton optogenetic stimulation is rarely performed as a standalone technique, and is almost always paired with calcium imaging. Special considerations are needed to match the properties of opsins and reporters to ensure that both multiphoton optogenetics and imaging are compatible for simultaneous use. At present, the most successful imaging approaches rely on "green" calcium indicators (i.e., the GCaMP series) that absorb ~920 nm matched to "red" opsins with longer excitation wavelengths. However, even opsins with peak absorption >1000 nm are sensitive to these imaging wavelengths. We discuss these constraints and the mitigation approaches that are available to combine multiphoton optogenetics with imaging under those circumstances.

Precise control of neural activity cannot be achieved by any one technique alone. New optical developments, new molecular tools, and new approaches to unite these devices are needed to reach the next step of precise causal manipulation of neural systems. As technologies are constantly being developed and improved, new experimental capabilities provide a path to answer previously intractable neuroscience questions. By combining different skillsets and expertise, the technological solutions that emerge are better than what would have been created from the perspective of a single discipline. Our intent is for this chapter to serve as a guide and resource for future users who will implement and improve upon multiphoton optogenetics and usher in a new epoch of neuroscience discovery.

#### 2 Methods

#### 2.1 3D-SHOT Optical System Design

Our experimental setup, shown in Fig. 1, is based on the standard design of a holographic microscope with a spatial light modulator (SLM) in the Fourier domain (pupil plane). The SLM shapes the phase of a coherent femtosecond laser light source to render custom 3D shapes [54, 55] digitally synthesized with Computer Generated Holography (CGH) [35–37]. Unlike scanning approaches, wide-area CGH holograms matched to the dimensions of each neuron's soma enable simultaneous, flash-based activation of a large number of opsin molecules, yielding photocurrents with fast kinetics [39].

In most brain structures, neurons are distributed continuously in 3D, not in discrete layers. Therefore, the inability of 2D optogenetics approaches (i.e., 2D CGH with temporal focusing) to target neurons at any arbitrary number of axial planes simultaneously is a major obstacle for large-scale optogenetic interrogation of neural circuits. To overcome this outstanding challenge, 3D-SHOT leverages the advantages of 3D-CGH to simultaneously address neurons in custom locations. To make 3D CGH and temporal focusing mutually compatible, 3D-SHOT forgoes the ability to create custom patterns. Instead, the optical path is optimized to holographically replicate multiple identical copies of a temporally focused excitation pattern, termed "custom temporally focused pattern" (CTFP). The CTFP is specifically engineered to match the dimensions of a neuron's soma, and to be compatible with 3D CGH so that identical copies of the CTFP, with individually specified brightness, can be placed at each target neuron anywhere in the accessible 3D volume. The result is a technology that is tailored for optogenetics applications and enables single-shot in vivo photoactivation of custom neuron ensembles distributed anywhere in the accessible volume, with single-neuron spatial resolution.

Fig. 1 Experimental setup for 3D-SHOT. This is made of two consecutive optical systems. First, a diffraction grating and a rotating diffuser are imaged onto each other by a f-f optical relay. This assembly shapes femtosecond laser pulses both spatially and spectrally to create a custom temporally focused pattern (CTFP) matched to the dimensions of a neuron soma. The resulting engineered point spread function is then spatially modulated by a second system that enables 3D computer-generated holography (CGH). A spatial light modulator (SLM) placed in the Fourier domain modulates the phase of the CTFP to target custom 3D locations with a point-cloud hologram. The resulting sculpted illumination pattern replicates identical copies of the CTFP at each targeted location. The 3D hologram is further demagnified by a tube lens and a microscope objective. A zero-order block eliminates any remaining undiffracted light from the hologram. The grating frequency determines the spectral dispersion, "a", and the diffuser determines the beam dimension "b" at the surface of the SLM. Those parameters, along with the focal lengths of lenses L1–L4, are adjusted to match the desired addressable volume and CTFP dimensions within constraints imposed by the SLM size, the laser source, and the numerical aperture of the microscope

To implement 3D-SHOT in a multiphoton microscope, the first step is to create a static, temporally focused object matched to the dimensions of the desired target. For this, we illuminate a reflective blazed diffraction grating with a femtosecond laser light source. The incidence angle is adjusted so that the first diffracted order reflects orthogonally to the surface of the grating. The groove depth, material, coating, and incoming wave polarization are adjusted to best match the desired laser wavelength and optical power density and to maximize the amount of light in the first diffracted order. Lenses L1 and L2 are in a 4-f configuration and create an exact optical relay that places a virtual copy of the diffraction grating at the surface of the transparent, rotating diffuser. The rotating diffuser applies an engineered (Gaussian) phase pattern to the temporally focused image that is continuously randomized by the mechanical rotation. This phase perturbation is necessary to distribute the energy in the Fourier domain (i.e., to uniformly illuminate the SLM), a critical step that enables the compatibility between 3D CGH and temporal focusing and maximizes the diffraction efficiency.

The CTFP is matched to the characteristic dimensions of a neuron soma by selecting the magnification $M = f_2/f_1$, where $f_2$ and $f_1$ designate the focal lengths of lenses L2 and L1, respectively. The axial confinement (temporal focusing) can be adjusted by selecting diffraction gratings with a higher (or lower) grating spatial frequency (in lines per mm). Both properties can be adjusted independently of the additional phase perturbation induced by the rotating diffuser.

A second 4-f system made with lenses L3 and L4, with focal lengths $f_3$ and $f_4$ respectively, relays the CTFP, first to the SLM in the pupil plane, then to the volume where the 3D hologram is first synthesized. In this configuration, the SLM applies a multiplicative phase pattern in the Fourier domain, which corresponds to a convolution operation in the real domain. To utilize 3D-SHOT for neural stimulation, a CGH algorithm only needs to compute a hologram made of a 3D cloud of diffraction-limited points centered on each target neuron, and the optical system will yield one copy of the CTFP at each target point. To compensate for spatially dependent diffraction efficiency and non-uniform losses through the optical system, the respective brightness of each target can be adjusted by computing digitally compensated holograms that redistribute the available laser intensity across each point of the 3D cloud based on expected losses. The total energy can be adjusted globally across all targets by modulating the power of the laser beam.
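The convolution relationship described above can be illustrated numerically. The following is a minimal 2D sketch, not a hologram computation: it treats the CTFP as a real-valued Gaussian spot and the point cloud as weighted delta functions, and the function name `replicate_ctfp`, the grid size, the spot width, and the target list are all hypothetical values chosen for illustration. It only shows that multiplying two spectra in the Fourier domain places one weighted copy of the spot at each target position.

```python
import numpy as np

def replicate_ctfp(ctfp, targets, weights):
    """Place weighted copies of `ctfp` at `targets` using the convolution theorem."""
    points = np.zeros_like(ctfp, dtype=float)
    for (row, col), weight in zip(targets, weights):
        points[row, col] = weight  # one weighted delta per target
    # Product of spectra in the Fourier domain <=> circular convolution in real space
    spectrum = np.fft.fft2(ctfp) * np.fft.fft2(points)
    return np.real(np.fft.ifft2(spectrum))

# Hypothetical 128 x 128 pixel grid with a Gaussian spot centred on pixel (0, 0).
n = 128
yy, xx = np.mgrid[0:n, 0:n]
dy = np.minimum(yy, n - yy)  # wrap-around distance to row 0
dx = np.minimum(xx, n - xx)  # wrap-around distance to column 0
sigma = 3.0                  # assumed spot width, in pixels
ctfp = np.exp(-(dx**2 + dy**2) / (2 * sigma**2))

targets = [(20, 30), (64, 100), (100, 40)]  # hypothetical target pixels
weights = [1.0, 0.5, 0.8]                   # per-target brightness
pattern = replicate_ctfp(ctfp, targets, weights)
```

In this simplified picture, the per-target weights play the role of the digitally compensated brightness values; a real implementation works with complex fields in 3D and a phase-only SLM.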

#### 2.1.1 3D-SHOT Design Parameters

1. The size of the CTFP must be adjusted to match the dimensions $d$ of a neuron. In a typical implementation of 3D-SHOT, the 3D hologram (see Fig. 1) is relayed into a microscope with an additional tube lens (L5, focal length $f_5$) and microscope objective (L6, focal length $f_6$) with magnification $M = f_5/f_6$. The dimension of the CTFP is $M d$ in the 3D hologram and $M d f_3/f_4$ at the rotating diffuser, which corresponds to an incoming beam of width $M d f_3 f_1/(f_4 f_2)$ at the blazed grating.

2. The rotating holographic diffuser is a transparent material with an engineered surface that deflects incoming light in a Gaussian angular distribution with a characteristic diffraction angle, $\alpha_d$. The diffuser spatially stretches the wave in the Fourier domain by an amount $b$ (see Fig. 1), optimized to ensure an even illumination of the SLM active area, and given by:

$$b = f_3 \alpha_d$$

3. The line spacing of the blazed diffraction grating, $l$, or its spatial frequency, $f_g$ ($l = 1/f_g$), must be adjusted to match the spectral bandwidth $\delta\lambda$ of the femtosecond laser and the desired dimensions of the CTFP along the (z) axis. The angular dispersion of the diffraction grating is given by $\alpha = \delta\lambda/l$, and stretches the pulse in the Fourier domain. At a distance $f_1$ from lens L1, the dimension of the stretched pulse satisfies $a f_2/f_3 = \alpha f_1$. Hence, the spectral dispersion, $a$, at the SLM is given by:

$$
a = \frac{\delta\lambda \, f_1 f_3}{l \, f_2}.
$$

The numerical aperture NA of the microscope objective is generally the limiting factor that defines the accessible volume and CTFP minimal dimensions for 3D-SHOT. The SLM pattern (see Fig. 1) of width $a + 2b$ is imaged onto the back aperture of the microscope objective. Hence, to fully capture the light modulated by the SLM, a suitable design constraint is to ensure $(a + 2b) f_5/f_4 < \mathrm{NA}\, f_6$. In this configuration, the characteristic size, $\delta_z$, of the CTFP along the (z) axis in the demagnified hologram under the microscope objective is given by:

$$\delta_z = \lambda \left( \frac{f_4 f_6}{a f_5} \right)^2$$
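The design relations above can be collected into a short calculation. The sketch below simply evaluates the formulas stated in this section; the function name `shot_design` and every numerical value in the example (focal lengths, diffuser angle, bandwidth, grating frequency, wavelength, NA, soma size) are hypothetical placeholders rather than the parameters of the published setup. Lengths are in millimeters and angles in radians.

```python
import math

def shot_design(f1, f2, f3, f4, f5, f6, alpha_d, delta_lambda,
                line_spacing, wavelength, na, d):
    """Evaluate the 3D-SHOT design relations (lengths in mm, angles in rad)."""
    m = f5 / f6                           # microscope magnification M
    ctfp_hologram = m * d                 # CTFP size in the 3D hologram
    ctfp_diffuser = m * d * f3 / f4       # CTFP size at the rotating diffuser
    beam_grating = m * d * f3 * f1 / (f4 * f2)  # beam width at the grating
    b = f3 * alpha_d                      # diffuser-induced spread at the SLM
    alpha = delta_lambda / line_spacing   # angular dispersion of the grating
    a = alpha * f1 * f3 / f2              # spectral dispersion at the SLM
    pupil_width = (a + 2 * b) * f5 / f4   # SLM pattern at the back aperture
    delta_z = wavelength * (f4 * f6 / (a * f5)) ** 2  # axial CTFP size
    return {
        "ctfp_hologram": ctfp_hologram,
        "ctfp_diffuser": ctfp_diffuser,
        "beam_grating": beam_grating,
        "a": a,
        "b": b,
        "fits_aperture": pupil_width < na * f6,
        "delta_z": delta_z,
    }

# Hypothetical example values, chosen only to exercise the formulas.
design = shot_design(
    f1=200.0, f2=100.0, f3=400.0, f4=200.0, f5=180.0, f6=9.0,
    alpha_d=0.008,             # diffuser angle, rad
    delta_lambda=10e-6,        # 10 nm bandwidth, in mm
    line_spacing=1.0 / 300.0,  # 300 lines/mm grating
    wavelength=1.03e-3,        # 1030 nm, in mm
    na=1.0,
    d=0.010,                   # 10 um soma, in mm
)
```

A calculation of this kind makes it easy to check, before ordering optics, whether a candidate set of lenses satisfies the aperture constraint and gives an axial CTFP size in the desired range.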

#### 2.1.2 Implementation Guidelines for 3D-SHOT

3D-SHOT is implemented as a secondary system on the path of a multiphoton microscope that is also designed for two-photon imaging, typically with a secondary laser. Imaging and photostimulation are most efficiently merged with a dichroic mirror or a polarizing beam splitter cube. When using a polarizing cube, the orientation of the SLM, grating, and path-merging cube must be adjusted to match polarization constraints, with one path (e.g., photostimulation) being reflected and the other one being transmitted so that the merged beams are co-aligned. Any incompatibilities can be resolved by inserting additional half-wave plates along the optical path. However, since no element has perfect transmission, any additional optical element will reduce the overall power throughput. 3D-SHOT is best assembled by first building the laser beam line at a fixed height on an optical table, then by installing the additional optical components starting from the beam-merging cube or dichroic mirror, and in the sequential order outlined below.


The laser should operate at minimum power levels during the alignment procedure. Some laser models are fitted with co-aligned low-power red lasers that may be used to safely align the entire system, with the exception of the diffraction grating tilt, which must be aligned at the desired excitation wavelength. The SLM surface is one of the most sensitive areas and light should not be focused on it (this may happen when placing lens L3). The safest approach is to temporarily replace the SLM with a flat mirror and to align the SLM at the end by focusing undiffracted light on the zero-order block.

3. The second design step is to replace the collimated beam illuminating the SLM with the engineered CTFP. For this, one must place and align lenses L3, L2, L1, in that order, in successive 4-f configurations. Pinholes and far-field images of the infrared beam can be used to check centering and beam collimation, respectively, to ensure that each newly inserted lens is properly centered and spaced. The reflective grating is placed last, at a distance $f_1$ from lens L1, with its reflective surface orthogonal to the optical axis.


During operation, special attention should be given to ensure that the rotating diffuser is spinning before allowing high laser power settings to avoid SLM damage.

#### 2.2 Characterization and Performance Metrics for 3D-SHOT

Our primary goal is not to render a visually accurate hologram but instead to increase contrast for two-photon excitation at selected locations while avoiding inadvertent photoactivation of non-targeted areas, which relaxes constraints on hologram computation. Hence, the traditional metrics used to characterize imaging systems such as resolution, contrast, and speckle are not adequate to evaluate the capabilities of 3D-SHOT in experimental applications.

Instead, to evaluate the capabilities of 3D-SHOT and quantify how two-photon absorption is spatially distributed in 3D, we placed a thin fluorescent film on a microscope slide under the excitation objective to record the corresponding two-photon fluorescence image with a fixed sub-stage objective coupled to a camera (Fig. 2a). We recorded tomographic slice images by mechanically moving the excitation objective along the z axis in 1-μm increments. The resulting data correspond to a quantitative 3D measurement of two-photon absorption induced by the CTFP. We first consider the case of conventional 3D holography (Fig. 2b). We computed a 10 μm disk image target at z = 0 where we imposed a high-frequency speckle pattern to maximize spatial confinement along the z axis. Projection views of two-photon absorption along the y, x, and z axes show how, even in an optimized hologram, inadvertent photostimulation remains possible above and below a neuron targeted with this method. With 3D-SHOT, however, experimental results (Fig. 2c) show that temporal focusing significantly enhances spatial confinement along the z axis in the CTFP.

Fig. 2 Optical characterization of the spatial resolution of CGH vs 3D-SHOT. (a) We used a fluorescent calibration slide and an inverted microscope to quantify two-photon excitation in 3D. (b) For conventional holography, we consider a 10 μm diameter disk target, and show (from top to bottom) projection views of two-photon absorption in the (x,y), (z,y), and (z,x) planes. (c) With 3D-SHOT, the CTFP was adjusted to a 10 μm diameter target and the same projection views were recorded. (d) We measured the FWHM of the radial (top) or axial (bottom) PSF through brain slices of varying thickness (\* indicates p < 0.05, Kruskal-Wallis Test with multiple comparison correction. Data represent the mean and standard deviation of n = 5 observations for each thickness of brain tissue). (Adapted from Pegard et al. [7])

Since 3D-SHOT is developed for the primary use case of optogenetic stimulation in brain tissue, we also recorded the effect of propagation through scattering medium on the radial and axial confinement of 3D-SHOT excitation. We cut acute mouse cortical brain slices of varying thickness and placed them between the excitation objective and the fluorescent slide. Recording two-photon absorption through physiologically relevant amounts of brain tissue revealed that scattering degraded radial resolution only after passing through 400 μm of mouse brain (Fig. 2d, 200 μm: p = 0.22, 300 μm: p = 0.21, 400 μm: p = 0.001, Kruskal-Wallis). Although the axial point spread function exhibited apparent degradation when imaged through brain tissue, this decrease was statistically significant only after passing through 300 μm of scattering tissue (Fig. 2d, 200 μm: p = 0.56, 300 μm: p = 0.04, 400 μm: p = 0.05, Kruskal-Wallis).
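The FWHM values reported in this section are derived from intensity profiles sampled along an axis. As a rough illustration of how such a value can be extracted from a z-stack, the sketch below assumes that the profile is approximately Gaussian (as in the calibration fits of Fig. 6c) and fits a parabola to the logarithm of the intensity. The function name `gaussian_fwhm`, the 10% threshold, and the synthetic profile are assumptions made for this example and are not taken from the authors' analysis code.

```python
import numpy as np

def gaussian_fwhm(z, intensity, threshold=0.1):
    """Estimate the centre and FWHM of a Gaussian-like profile.

    Fits a parabola to log(intensity), using only samples brighter than
    `threshold` times the maximum so that noisy tails are ignored.
    """
    z = np.asarray(z, dtype=float)
    intensity = np.asarray(intensity, dtype=float)
    keep = intensity > threshold * intensity.max()
    c2, c1, _ = np.polyfit(z[keep], np.log(intensity[keep]), 2)
    if c2 >= 0:
        raise ValueError("profile is not peaked; Gaussian fit is not meaningful")
    sigma = np.sqrt(-1.0 / (2.0 * c2))
    centre = -c1 / (2.0 * c2)
    fwhm = 2.0 * np.sqrt(2.0 * np.log(2.0)) * sigma
    return centre, fwhm

# Synthetic z-stack sampled in 1-um steps, with a hypothetical 20-um FWHM.
z_um = np.arange(-40.0, 41.0, 1.0)
true_sigma = 20.0 / (2.0 * np.sqrt(2.0 * np.log(2.0)))
profile = np.exp(-(z_um - 3.0) ** 2 / (2.0 * true_sigma**2))
centre_um, fwhm_um = gaussian_fwhm(z_um, profile)
```

With real data, a nonlinear least-squares fit that includes a background term is usually more robust than this log-parabola shortcut, which is sensitive to offsets in the baseline.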

#### 2.2.1 Scanless 2P Optogenetics Using 3D-SHOT

For neurobiological applications, we evaluated the spatial resolution of 3D-SHOT via quantitative measurements of the photocurrent amplitudes elicited by optogenetic stimulation in neurons. To do so, we expressed microbial opsins in neurons through in utero electroporation of mouse embryos. Opsin-expressing neurons were then brought under the objective either in acutely prepared cortical brain slices or in vivo with head-fixed animals.

To measure the spatial resolution of optogenetic excitation (or "Physiological" Point Spread Function – PPSF) we recorded the neuronal response to multiphoton photostimulation as a function of the displacement between the holographic target and the patched cell (Fig. 3a).

The efficacy of two-photon excitation is not only dependent on the shape of 3D-SHOT's CTFP, but also on the precise targeting of this pattern to the cell soma, the level of opsin expression in the targeted neuron, and the laser intensity. Computer-generated holography already offers micron-level spatial resolution for placing holographic targets onto the desired neurons with a microscope objective. However, the level of opsin expression varies from neuron to neuron, and consequently so does the required power level for photostimulation. We experimentally compared (Fig. 3a) the spatial confinement of 3D-SHOT and 2P-CGH (without temporal focusing) for photoexcitation as a function of incident laser power density. Toward this end we obtained voltage-clamp recordings of neurons, and recorded the PPSF at a variety of different laser powers (Fig. 3b). With conventional holography, we observed substantial photocurrents 25–50 μm above and below the disk image target, indicating that photoactivation of non-targeted neurons is likely to be an issue. Temporal focusing significantly enhances spatial resolution with 3D-SHOT, and photocurrents are more strongly attenuated above and below the primary focus (Fig. 3c). We observed that the axial resolution with 3D-SHOT was significantly improved relative to CGH, even using several orders of magnitude more laser power.

Whereas two-photon photoexcitation with CGH relies only on defocusing to confine the excitation light to the desired volume, 3D-SHOT benefits from simultaneous defocusing and temporal confinement, as femtosecond pulses are temporally stretched above and below the desired target, which further attenuates the nonlinear response regardless of the targeted area in the (x,y) plane [41]. 3D-SHOT's shallow relationship between laser power and spatial resolution is critical: it allows sufficient excitation light to generate action potentials without the significant loss of spatial confinement that normally occurs with CGH. It also gives the user the option to use additional power to reliably stimulate neurons when the exact level of opsin expression is unknown, without significantly affecting spatial resolution.

Fig. 3 3D-SHOT generates axially confined photoactivation. (a) A photostimulation pattern generated with CGH (top) or 3D-SHOT (bottom) was mechanically stepped along the optical axis (z) and passed through a cell expressing opsin. Photocurrents were recorded in the whole-cell voltage-clamp configuration in neurons. (b) FWHM of the characteristic response profile for both methods at various power levels. (c) Photocurrent response profile for CGH (left) and 3D-SHOT (right) with a 10 μm disk target and different power levels. (d) Spatial profile of two-photon evoked spiking of a L2/3 pyramidal neuron in a mouse brain slice (left) in the radial dimension. Black: CGH; red: 3D-SHOT, p < 0.56 Mann-Whitney U-test, and (right), along the axial dimension (p < 0.006, Mann-Whitney U-test). (e) Quantification of the FWHM comparing CGH and 3D-SHOT. (f) Full volumetric assessment of photostimulation resolution; points throughout the volume were tested, but only points that elicited spike probability greater than zero are shown. (Adapted from Pegard et al. [7])

#### 2.2.2 3D-SHOT Photostimulation with Single-Neuron Resolution

We next quantified the physiological spatial resolution of CGH and 3D-SHOT in neurons by measuring the spiking probability along the radial direction in the imaging (x,y) plane and along the optical (z) axis. We compared holography and 3D-SHOT by projecting a single photostimulation target placed at a distance (x,y,z) from a patched neuron in mouse brain slice, either with a single copy of the CTFP with 3D-SHOT or a disk-shaped pattern of equivalent size using CGH. We measured spike probability as the hologram was displaced in small increments by mechanically moving the objective relative to the patched cell. Experimental results (Fig. 3d) show similar spatial resolution with both methods in the radial direction in the focal plane, with a FWHM of 10 ± 2 μm for holography, and 9 ± 1.3 μm for 3D-SHOT (p = 0.57, Mann-Whitney U-Test), consistent with the dimensions of the disc and Gaussian patterns at the focal plane (Fig. 4a).

Fig. 4 3D-SHOT provides cellular resolution photostimulation in a large volume through digital focusing. (a) To quantify the spatial resolution of 3D-SHOT as a function of hologram target depth, we recorded the spike probability in cortical neurons while digitally targeting varying positions along the optical axis (z), and measured resolution by mechanically sweeping the objective over the entire (z) range and measuring the response at each point. (b) Spike probability in cortical neurons while targeting the same cell from different axial displacements (p = 0.2, Kruskal-Wallis Test). (c) Spike probability resolution as a function of digital displacement; shaded green colors denote mechanical sweeps across the optical axis for different digital displacements. (d) Quantification of the FWHM for the axial fit of spike probability as a function of digital defocus from the focal plane (p = 0.17, Kruskal-Wallis Test). (Adapted from Pegard et al. [7])

However, with conventional holography, the spike probability along the z axis does not permit single-cell resolution (FWHM = 78 ± 6 μm). In contrast, 3D-SHOT provides far superior resolution (FWHM = 28 ± 0.7 μm, p = 0.006, Mann-Whitney U-Test, Fig. 3d) compatible with single-cell resolution in all three dimensions, in that the FWHM of spike probability is on par with the typical dimensions of a cortical neuron and their intersomatic spacing (Fig. 3e). We recorded from neurons in brain slices and measured the spiking probability in response to 3D-SHOT excitation by digitally refocusing a hologram to stimulate positions throughout a 50 × 50 × 100 μm (x,y,z) grid. This experiment revealed that the neuron was photoactivated only when the disc image was targeted to the cell body (Fig. 3f).

#### 2.2.3 Spatially Precise Remote Control with 3D-SHOT

A major advantage of holographic optogenetics is the ability to photoactivate neurons at different depths that are part of the same circuit. Since the major advance of 3D-SHOT is its ability to target temporally focused patterns arbitrarily in 3D, it is vital that 3D-SHOT maintains its ability to activate neurons with high spatial resolution even when digitally focusing light far from the zero order of the optical system (e.g., the center of the optical axis at the natural focal depth of the system). Therefore, we next evaluated the accessible depth within the volume of interest by measuring the activation and spatial resolution as a function of the distance from the holographic zero order. Toward this end we measured spike probability in neurons via current clamp in mouse brain slices (Fig. 4a). To test if the CTFP can be digitally displaced along the z axis, we systematically moved the digital focus of the hologram (with a lens term on the SLM), and accordingly corrected the mechanical position of the objective by the same distance ($\delta z_{\mathrm{Digital}} = \delta z_{\mathrm{Mechanical}}$). This test showed that 3D-SHOT effectively photostimulates cells at locations distal to the zero order, as photocurrent and spike probability were not affected by digital offset in z (Fig. 4b, p = 0.2, Kruskal-Wallis).

We next asked if the axial resolution of stimulation was constant when stimulating away from the natural focal plane at z = 0. For this we measured the FWHM of the axial PPSF as a function of digital defocus. As before, we digitally moved the holographic target along the z axis, but instead of matching the digital and mechanical offset, we stepped the objective across the entire range of the z axis and measured the physiological response at each location. This allowed us to measure the axial spatial resolution of photostimulation from locations distributed on either side of the optical axis. Results show that 3D-SHOT effectively confines excitation to the desired depth range throughout the 180 μm range that we sampled, as the FWHM of stimulation did not change as a function of digital defocus in z (Fig. 4c, d, p = 0.17, Kruskal-Wallis Test). These experiments show that 3D-SHOT retains axial confinement capabilities that are compatible with single-cell resolution for photocurrents and spike probability while targeting neurons at any depth within the accessible volume defined by the SLM and the microscope assembly.

#### 2.2.4 Volumetric Optogenetics at High Spatial Resolution

In addition to being able to create multi-target patterns at arbitrary depths with precise power control, the utility of 3D-SHOT for various applications is also determined by the absolute targetable volume and the number of targets that can be placed in a single hologram with the appropriate spatial resolution. To test the limits of the volume that can be simultaneously targeted using 3D-SHOT in our setup, we randomly selected up to 75 points distributed throughout the 350 μm × 350 μm × 280 μm volume of interest, and generated a point-cloud hologram simultaneously targeting all of these points. We measured 2P absorption using a sub-stage camera (as in Fig. 2) and we measured the radial and axial confinement of each spot in the multi-target hologram (Fig. 5a). Quantifying the radial and axial FWHM showed that adding additional targets (up to 75) did not degrade the confinement of light in multi-spot holograms (Fig. 5b, p = 0.34 Axial FWHM, Kruskal-Wallis).

Fig. 5 Spatial resolution with simultaneous targets throughout a large volume. (a) 3D-SHOT was tested by simultaneously targeting 75 randomly distributed targets within the full operating volume defined by the SLM's spatial range for the first diffracted order. Projection views of the 3D reconstruction of 2P-induced fluorescence in a calibration slide are shown along the (y, z), (x, z) axis, with a 3D projection. (b) Similar experiments were repeated with 20, 30, 50, and 75 targets. The FWHMs of the two-photon response were computed for each target, and show that spatial resolution and axial confinement are not significantly degraded by increasing the number of simultaneous targets in any given hologram (axial FWHM: p = 0.34, Kruskal-Wallis; radial FWHM: p < 0.001 for 75 ROIs; p > 0.05 for all other comparisons). (Adapted from Pegard et al. [7])

Overall, our experiments show that with 3D-SHOT, as with CGH, the accessible volume and available optical power under the objective depend on the diffraction efficiency of the SLM, the laser power, and on cumulative losses across the optical path. Altogether, these design parameters determine the number of neurons that can be simultaneously illuminated with the desired spatial resolution. Here, with 600 × 800 pixels on the SLM, we characterized single-shot photostimulation of up to 75 targets (limited by laser power) with no degradation of resolution within a 0.034 mm<sup>3</sup> volume (350 μm × 350 μm × 280 μm, Fig. 5). For comparison, custom photostimulation patterns have been demonstrated in previous works within a 0.017 mm<sup>3</sup> operating volume with multi-level TF [51] (240 μm × 240 μm × 300 μm), and ~6.25 × 10<sup>−4</sup> mm<sup>3</sup> with point scanning methods (250 μm × 250 μm × 15 μm) [18, 33].

#### 2.3 Calibration of 3D-SHOT with Imaging System

The calibration and alignment of the optical system are critical to the successful use of any multiphoton stimulation system, and become even more challenging as the resolution of such a system improves. Furthermore, whereas it is customary to report the best possible resolution in optics publications (to explain the potential of the technique), it is also known that the resolution is not constant throughout the entire working volume. For biological experiments, however, it is necessary to know the actual resolution at any given point in the working volume. Even subtle errors introduced by aberrations of lenses and the SLM can lead to mistargeting problems that will prevent accurate experiments if they are not accounted for. To that end, we developed a calibration protocol and numeric tools to map the 3D holographic stimulation path onto the multiplane two-photon imaging path. This calibration empirically accounts for aberrations and distortions that are introduced not only by the SLM and associated lenses, but also by the optomechanical defocusing method (in this example an electro-tunable lens, ETL). This strategy provides both an improved calibration over less thorough procedures and, even more critically, accurate measurements of the size of the holographic disk throughout the useable volume. While this procedure is quite slow, it is fully automated and can be run overnight (see Note 2.3.1). Scripts are available at https://github.com/adesniklab/3D-SHOT/AutoCalib

In addition to the 3D-SHOT and 2P imaging system described above, the calibration requires a substage camera (Fig. 6a). While many camera-objective pairings are theoretically useable, we used a Basler camera (acA1300-200um) with a 5× Olympus air objective (Olympus MPLFLN) and a thin fluorescent slide (see Note 2.3.2). Care should be taken to ensure that the substage camera's field of view is at least as large as the imaging field of view.

#### Calibration Procedure


Fig. 6 Calibration protocol for 3D-SHOT. (a) Substage camera assembly for calibration with a uniform fluorescent thin film on a microscope slide. (b) A single hologram is imaged at 13 different planes by moving the hologram with respect to the thin fluorescent slide. The full range is 40 μm from the estimated center of the hologram. (c) We fit a Gaussian curve to the measured fluorescence at each plane for each hologram recorded in (b). Relevant resolution characterization values (peak intensity, FWHM, and depth) are extracted for each hologram. (d) We first identify the relationship between the predicted SLM defocus and the detected depth of the corresponding holograms. (e) Mapped relationship between hologram FWHM and the hologram depth (left) is measured in the entire volume accessible by the SLM (right). (f) Hologram diffraction efficiency is measured throughout the field of view. (g) True depth of the two-photon imaging planes, as detected by the substage camera. We note that the planes are neither flat, evenly spaced, nor parallel to the axis of the camera, but that the calibration will account for all those discrepancies. (h) The final hole pattern in SLM space accounts for aberrations and curvature from both the SLM/stimulation path and imaging path. (i) Images of the holes ablated in the first plane, and for a subsequent plane. The hole pattern is asymmetric, so that subsequent planes will not burn in the same location. (j) Simulated targeting error over the full calibrated volume

using a test hologram near the zero order block and set the power such that it is just below saturating the camera pixels.


possible measurement of fluorescence with the zero-order block removed and a zero-order hologram. However, in practice this is unnecessary, the disruptions from removing and replacing the zero-order block are non-negligible, and the absolute value of the measured diffraction efficiency is less important than the exact profile along which it falls off.


measuring the error by attempting to map coordinates onto themselves with a "round trip" interpolation between any two coordinate systems. Additionally, since the last step ablates targets for which both SLMXYZ and CameraXYZ coordinates are known, a second mapping can directly compare SLMXYZ to 2PXY + ETLZ coordinates, but since this approach does not include a true measurement of Z it is assumed to be less accurate. Typically, the mean interpolation error for a calibration is <2 μm throughout the entire useable area.
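The "round trip" check described above can be sketched with a deliberately simple model. The example below assumes an affine relationship between two coordinate systems and fits it by least squares in both directions; the published scripts may use a different interpolation scheme, and the function names, the synthetic SLM and camera coordinates, and the noise level are all hypothetical.

```python
import numpy as np

def fit_affine(src, dst):
    """Least-squares affine map such that dst ~ [src, 1] @ coeffs."""
    src_h = np.hstack([src, np.ones((len(src), 1))])
    coeffs, *_ = np.linalg.lstsq(src_h, dst, rcond=None)
    return coeffs

def apply_affine(coeffs, pts):
    """Apply an affine map returned by fit_affine to an (N, 3) array."""
    return np.hstack([pts, np.ones((len(pts), 1))]) @ coeffs

def round_trip_error(a_pts, b_pts):
    """Map A -> B -> A with independently fitted maps; return per-point error."""
    forward = fit_affine(a_pts, b_pts)
    backward = fit_affine(b_pts, a_pts)
    recovered = apply_affine(backward, apply_affine(forward, a_pts))
    return np.linalg.norm(recovered - a_pts, axis=1)

# Synthetic calibration points: SLM units mapped to camera micrometres.
rng = np.random.default_rng(0)
slm_xyz = rng.uniform(-1.0, 1.0, size=(200, 3))
true_map = np.array([[150.0, 5.0, 0.0],
                     [-4.0, 150.0, 0.0],
                     [0.0, 0.0, 60.0]])
camera_xyz = slm_xyz @ true_map + np.array([400.0, 300.0, 0.0])
camera_xyz += rng.normal(0.0, 0.5, size=camera_xyz.shape)  # measurement noise

errors = round_trip_error(slm_xyz, camera_xyz)
```

A small round-trip error only shows that the forward and backward maps are mutually consistent. It does not replace a physical test such as ablating holes at known targets.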

(m) Troubleshooting. There are two very common sources of miscalibration. First, imaging conditions can change, for example through evaporation of the immersion liquid or instability in the lasers, and the fluorescence intensity of holograms sometimes decreases over time. To detect this, plot the peak hologram intensity in the order in which it was measured. Because the locations are randomized, any visible trend indicates an instability. Imprecise points can be manually excluded, or the whole procedure can be repeated with steps taken to ensure that this problem will not occur again.

Second, the calibration slide may move. Since the calibration can take many hours, even subtle movement of the slide will disturb the calibration. This can be an insidious problem because, depending on when the movement happens, it can manifest in different ways. Prevention is the best remedy, and firmly securing the slide and the substage camera will mitigate this issue. A wise step at the end of a calibration is to conduct a hole test by ablating targets above and below a test location, with offsets of a few microns, to test the XYZ accuracy (even a target offset by 2 μm will burn less efficiently than a properly targeted hologram). Typically, slide movements will only result in an XYZ offset, and a digital offset can be applied without the need for a full recalibration. This approach can also help fix small post-calibration misalignments that may occur if there is any drift of the laser beam, when a full recalibration is not desired.
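The first check described above, looking for a trend in peak hologram intensity over acquisition order, can be made quantitative with a straight-line fit. The sketch below is one possible way to do this; the function name `intensity_drift` and the synthetic data are assumptions made for illustration, and any acceptance threshold would need to be chosen for the specific setup.

```python
import numpy as np

def intensity_drift(peaks):
    """Fractional change in peak intensity over a run, from a straight-line fit.

    `peaks` must be ordered by acquisition time. Because target locations are
    randomized, a stable system should give a value close to zero.
    """
    peaks = np.asarray(peaks, dtype=float)
    order = np.arange(len(peaks))
    slope, intercept = np.polyfit(order, peaks, 1)
    return slope * (len(peaks) - 1) / intercept

# Synthetic examples: a stable run and a run that loses 30% of its signal.
rng = np.random.default_rng(1)
n_holograms = 500
stable = rng.normal(100.0, 2.0, n_holograms)
fading = 100.0 * (1.0 - 0.3 * np.arange(n_holograms) / (n_holograms - 1))
fading = fading + rng.normal(0.0, 2.0, n_holograms)

stable_drift = intensity_drift(stable)
fading_drift = intensity_drift(fading)
```

A strongly negative value, as in the second synthetic run, would suggest a problem such as evaporating immersion liquid, and the affected points or the whole calibration could then be repeated.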

#### Notes

2.3.1: Overnight calibrations: While the described calibration is designed to run autonomously, several problems can arise that will disrupt it. First, the immersion liquid between the objective and the slide can dry up. To avoid this, we use a 1:10 dilution of Ultrasound Gel (NA 1.0, Parker Aquasonic 100), and create a well to hold excess liquid. Second, the setup is much more susceptible to movements during calibration than it will be in its final form. We also recommend signage to ensure nobody disturbs the procedure.


The requirements for opsin constructs for two-photon optogenetics differ considerably from those used for one-photon activation. Most notably, multiphoton approaches benefit from "soma-targeted opsin constructs", i.e., those that are only expressed in the soma and proximal dendrites. Without soma targeting, the spatial resolution is compromised [11, 19, 56]. Furthermore, other opsin properties, such as photocurrent amplitude, absorption spectrum, and photocurrent kinetics, will strongly affect the experimental abilities of a 3D-SHOT system. These biophysical properties will interact with both the imaging and 3D-SHOT system and will alter the resolution, precision, and scale of neural control that is possible. We will briefly summarize how these properties interact, before describing a protocol for assessing opsin properties with regard to how to best activate or suppress a neuron. While many techniques are employed in this evaluation process, we will focus on those steps that are germane to opsin evaluation and two-photon optogenetics while directing the reader elsewhere for some technical procedures. While we will focus on selection of activating opsins, we will briefly discuss the additional concerns surrounding selection of a suppressing opsin.

We will also include, where possible and relevant, data on existing opsins. While many studies focus on individual features of opsin behavior, proper evaluation of an opsin requires a holistic understanding of many of its properties. Relatively few activating opsins are commonly used with multiphoton optogenetics. The most common variants are ChR2 [28], C1V1 [8, 13], ChrimsonR, Chronos [15], ChroME [11], ReaChR [57], CoChR [56], and ChRmine [14], each with its own advantages and disadvantages. Opsins for two-photon suppression are less well characterized, with only Arch [13], NpHR3, PsuACR, iC++, GtACR1 [11], and GtACR2 [58] being described.

We will select for opsins that:


#### Procedure

#### 1. Subcellular Targeting.

When expressed in neurons, the opsin must be properly trafficked to the cell membrane but restricted to whatever extent possible away from the distal dendrites and axons (Fig. 7a). A sequence from the Kir2.0 channel [59] can be very helpful to export from endoplasmic reticulum, while a portion of the Kv2.1 channel [60] has become the standard (but not the only [56]) "soma targeting" motif, facilitating both membrane expression and restriction to the soma and proximal dendrites. It is advisable to use all subcellular targeting motifs even when testing opsins in reduced preparations such as CHO cells, as membrane targeting can affect photocurrents substantially. To begin opsin characterization, it is advisable to examine the trafficking of your opsin by creating a construct with the opsin fused to a fluorophore, even if ultimately a different fluorophore configuration is preferable. This way, one can ensure that the construct is well-trafficked to the membrane, while still restricted as much as possible to the soma. Furthermore, internal protein aggregation may be an indicator of toxicity. For more detailed discussion of toxicity assessment at multiple stages, see Note 2.4.1. In our experience, with the exception of incidences of overexpression, the gene delivery technique (e.g., AAV, IUE, transgenic) does not dramatically change intracellular trafficking patterns. Still, the most rigorous approach is to examine your particular opsin with the delivery mechanism you will use (see Note 2.4.2 for a discussion of delivery approaches).

Fig. 7 Characterizing opsin characteristics for use with 3D-SHOT. (a) Comparison of non-soma-targeted opsin localization (left) to soma-targeted opsin localization (right). (Adapted from Mardinly et al. [11]). (b) Comparison of photocurrent FWHM (right) at different points on the opsin response function (left). Closer to saturation (dark blue), the actual FWHM of the photocurrent is larger than the theoretical opsin FWHM. (c) Comparison of photocurrent amplitude and kinetics for several commonly used opsins expressed in CHO cells. (Adapted from Sridharan et al. [61]). (d) Schematic of three common opsin kinetic metrics: left, time to peak used to measure opening kinetic. Center, desensitization. Right, tau off, a metric of decay kinetic. (e) Schematized absorption spectra for three opsin types compared with GCaMP absorption spectrum (dashed green). (f) Schematic of how, with fast imaging and slow opsins (top left) scan-induced photocurrents can accumulate to produce unwanted spiking, while in other conditions they decay and do not produce spiking. (g) Schematic displaying how under different stimulation conditions, short laser light pulses (left) can produce more or less reliable spike trains depending on opsin characteristics

#### 2. Photocurrents.

Photocurrent is perhaps the most obviously important measure of potency for an opsin. In many cases the opsin that fluxes the most current will be the most useful, as more potent opsins can drive more cells with less energy. While it may be convenient to compare the one-photon response to light, 3D-SHOT relies on the multiphoton process and thus one-photon responses are not an acceptable substitute (see Note 2.4.3).

There are two main criteria to consider when evaluating photocurrent. First, it is important to probe a large range of light intensities. Different opsins will saturate (i.e., reach maximal photocurrent) at different illumination powers and at different photocurrent levels. While it is necessary to reach a certain threshold to spike a cell (1 nA is a good approximation to spike an L2/3 pyramidal cell with a 5-ms pulse), the shape of the power-response curve will impact your resolution. The minimal power needed to spike a cell will dictate the total number of cells that can be activated with a given microscope, and the total heat added for a given number of activated cells (see Note 2.4.4). In addition, the further the current needed to spike a cell is from the saturating current, the better the effective resolution will be (Fig. 7b).

While peak current is often reported, the average current over a given pulse duration, or the total charge fluxed, is what ultimately drives cell activation and thus is a more relevant measure for photoactivation. This is especially important considering that many opsins show desensitization (see detailed discussion in step 3, Opsin Kinetics).
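The distinction between peak current, average current, and total charge can be made concrete with a minimal sketch. The function name, the trapezoidal integration, and the synthetic desensitizing trace below are illustrative assumptions, not part of the published protocol; a real recording would first need baseline subtraction.

```python
import math

def photocurrent_metrics(time_s, current_a, pulse_start_s, pulse_end_s):
    """Return peak current, mean current, and total charge during the light pulse."""
    # Keep only the samples that fall inside the light pulse.
    samples = [(t, i) for t, i in zip(time_s, current_a)
               if pulse_start_s <= t <= pulse_end_s]
    if len(samples) < 2:
        raise ValueError("need at least two samples inside the pulse")
    peak = max(abs(i) for _, i in samples)
    # Trapezoidal integration of |I| over time gives the charge in coulombs.
    charge = sum(0.5 * (abs(i0) + abs(i1)) * (t1 - t0)
                 for (t0, i0), (t1, i1) in zip(samples, samples[1:]))
    duration = samples[-1][0] - samples[0][0]
    return {"peak_a": peak, "mean_a": charge / duration, "charge_c": charge}

# Hypothetical desensitizing trace: 5 ms pulse sampled at 10 kHz, in amperes.
time_s = [k / 10000 for k in range(51)]
current_a = [1e-9 * (0.4 + 0.6 * math.exp(-t / 0.001)) for t in time_s]
metrics = photocurrent_metrics(time_s, current_a, 0.0, 0.005)
print(metrics)
```

For a desensitizing opsin such as the synthetic one here, the mean current is well below the peak, which is why the peak alone can overstate how effectively a pulse drives the cell.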

Available opsins differ greatly in photocurrent magnitude (Fig. 7c). While a direct comparison of all commonly used opsins is not available (though see Sridharan et al. [61] for a comparison of many), only ChroME family [11, 61] and ChRmine family [14, 62] opsins appear to reliably reach the 1 nA benchmark. Cells expressing either CoChR [56] or ReaChR [57] occasionally reached this threshold, but not reliably. It is possible that further improvement of targeting and expression will help these opsins reach this benchmark.

#### 3. Opsin Kinetics.

While selecting the opsin with the maximum photocurrent makes sense in many cases, it may come at the expense of speed. Fast opsins are necessary to take full advantage of the temporal control offered by 3D-SHOT and to write specific spike trains, with low spike jitter and high fidelity.

A fast opsin must both open and close its ion channel very quickly. If the closing kinetic is too slow, two or more action potentials may result from a single stimulation [34, 52, 53], and it may be impossible to drive cells to fire at high rates, disrupting the ability to write a known train. Similarly, if the opsin opening kinetic is too slow, longer stimulation periods will be needed to drive the cell to fire, and the uncertainty (jitter) in action potential timing [52] will increase.

Moreover, the opening (but not closing) kinetics of both activating and suppressing opsins [11, 46, 63] are often dependent on laser stimulation power, adding additional complexity [11]. As the precise mechanics that lead the opsin to transition between conducting states are not yet fully understood [64, 65], and even less is known about how these transitions may be affected by the two-photon process, there is often no way to model or infer kinetics based on the structure of an opsin alone. Instead, the only solution is to measure these kinetics with each new prospective opsin.

The opening kinetic is often measured using the time to peak or to 90% of peak current, but a τ derived from an exponential fit can also be reported [8, 11, 15, 56] (Fig. 7d). During prolonged pulses, many opsins' photocurrents reduce with time, a phenomenon known as desensitization (Fig. 7d). This may occur through inactivation of a population of channels and/or by individual channels entering states with different conductance [53, 65]. At the cessation of a pulse, the opsin current decays gradually. This closing kinetic is typically reported by fitting an exponential and calculating the τ (Fig. 7d) [56]. At times, a double exponential may fit the data better than a single one [11].
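As a rough illustration of these metrics, the sketch below computes a time to peak and a single-exponential τ off from a baseline-subtracted trace. The function names and the synthetic trace are hypothetical, and the log-linear least-squares fit is a simplification chosen to keep the example dependency-free; with noisy recordings, a nonlinear fit restricted to samples above the noise floor (or a double exponential) would likely be more appropriate.

```python
import math

def time_to_peak(time_s, current_a, pulse_start_s):
    """Time from light onset to the largest absolute current."""
    idx = max(range(len(current_a)), key=lambda k: abs(current_a[k]))
    return time_s[idx] - pulse_start_s

def tau_off(time_s, current_a, pulse_end_s):
    """Decay constant from a least-squares line through log|I| after light offset."""
    pts = [(t - pulse_end_s, math.log(abs(i)))
           for t, i in zip(time_s, current_a)
           if t >= pulse_end_s and abs(i) > 0]
    if len(pts) < 2:
        raise ValueError("need at least two nonzero samples after the pulse")
    mean_x = sum(x for x, _ in pts) / len(pts)
    mean_y = sum(y for _, y in pts) / len(pts)
    sxx = sum((x - mean_x) ** 2 for x, _ in pts)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pts)
    slope = sxy / sxx            # log|I| = log|I0| - t / tau, so slope = -1 / tau
    return -1.0 / slope

# Hypothetical trace: 5 ms pulse, 1 ms rise constant, 3 ms decay constant, 10 kHz sampling.
time_s = [k / 10000 for k in range(201)]
i_end = 1e-9 * (1 - math.exp(-0.005 / 0.001))
current_a = [1e-9 * (1 - math.exp(-t / 0.001)) if t <= 0.005
             else i_end * math.exp(-(t - 0.005) / 0.003) for t in time_s]
print(time_to_peak(time_s, current_a, 0.0), tau_off(time_s, current_a, 0.005))
```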

Several groups have endeavored to identify or engineer very fast opsins for one- or two-photon use [11, 15, 66–69]; these promise to be valuable tools for optogenetic control. While mutations that speed up an opsin often come at the expense of total photocurrent, this does not appear to be an absolute rule (note the success of ChroME and ChETA [11, 68]). Furthermore, it is hard to know how fast is "fast enough". Chronos and its mutant forms, the ChroME opsins, are the fastest opsins used in 2p-activation to date, and can be used in vivo to drive spike trains with jitter much less than one millisecond [11, 61, 70]. Nevertheless, in the correct conditions even much slower opsins such as ChrimsonR [52], CoChR [56], or ChRmine [14] can reach jitter of about 1 ms. However, cells expressing these last three opsins struggle to follow rates over 20 Hz [14, 52], probably due to their slower off kinetics.

#### 4. Two-Photon Spectra.

The two-photon absorption spectrum of an opsin affects its compatibility with imaging approaches and suitability for use with the high-power lasers typically used for 3D-SHOT. A chief concern with simultaneous imaging and holographic activation is the phenomenon of crosstalk, in which the scanning diffraction-limited spot used to excite GCaMP fluorescence also activates opsin molecules (the reader may also refer to Chaps. 2, 3, 5, 6, and 11 of this book for an extended overview of this phenomenon). Opsins with low absorption in the range of wavelengths used to image GCaMP (typically 910–930 nm) reduce crosstalk. Unfortunately, many traditional opsins are highly activated by blue light, so this has required the development of many new opsins. Alternate calcium indicators that absorb in other wavelengths are also available but are much dimmer than available GCaMPs [71–73]. In addition, most commercially available high-energy lasers emit around 1030 nm [63]. Thus, the optimal opsin would have a peak photocurrent around 1030 nm with a comparative minimum at the wavelengths to image GCaMP (~930 nm) (Fig. 7e, see Note 2.4.5 for further discussion of alternate strategies).

It is well known that fluorophores have two-photon absorption spectra that are considerably different from their one-photon counterparts [74]. To assess the sensitivity at various two-photon wavelengths, the simplest approach is to use a 2p imaging microscope with a femtosecond laser (e.g., a Ti:Sapphire oscillator) that is tunable across a large range of wavelengths. Since spectral response is not known to be affected by the light delivery method, using a scanning imaging system to photoactivate opsins is a suitable approach to compare the relative activation at different wavelengths and measure the spectral response profile. Recording photocurrents from CHO cells at different wavelengths while scanning will provide a two-photon spectrum for the opsin. Emission power varies with wavelength, so be sure to measure the power out of the objective at all testing wavelengths.

Of the common excitatory opsins, ChR2 [28] and CoChR [56] are blue-shifted, making them suboptimal for pairing with GCaMP imaging (Fig. 7e). Several 2p-optimized opsins, including Chronos, ChroME [11], and ReaChR [57], peak around 1000 nm, and more red-shifted opsins such as C1V1 [13], ChrimsonR [11], and ChRmine [14] peak beyond 1040 nm.

#### 5. Characterizing Crosstalk in Imaging Conditions.

In most experiments, multiphoton optogenetics will be paired with multiphoton imaging. It is important to expressly consider the relevant ways that these two systems interact. While the stimulation laser can create an artifact on the imaging system (see Note 2.4.6), the more insidious form of crosstalk is where the imaging laser activates the opsin. This crosstalk results from opsins that are somewhat sensitive to imaging wavelengths, as discussed above. Even when the resulting depolarizations (or hyperpolarizations in the case of an inhibitory opsin) are not enough to overtly change the spiking rate of cells, they can cause significant currents which may alter the timing of action potentials.

Opsin kinetics and strength also influence compatibility with 2p imaging. The diffraction-limited spot used for imaging scans across a cell in a very different way than 3D-SHOT illumination. Here, fast opsins are again advantageous because activation will decay to baseline between imaging frames, thus not driving the cell to spike (Fig. 7f). Strong, slow opsins may be most vulnerable to the effects of crosstalk, but this can be addressed at least in part by interleaving several frames in different areas to extend the time between repeated stimulation. In contrast, as activation is proportional to the dwell time of the imaging laser on a cell, approaches that increase this dwell time, such as using a smaller field of view, will suffer worse crosstalk [11].
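The interplay between frame rate and off kinetics can be illustrated with a deliberately simplified toy model. It assumes (none of this is stated in the protocol) that each pass of the imaging laser adds the same instantaneous photocurrent increment, that the current decays as a single exponential with the opsin's τ off, and that increments sum linearly without saturation. The τ values in the example are hypothetical.

```python
import math

def steady_state_buildup(frame_period_s, tau_off_s):
    """Peak scan-induced current after many frames, relative to one frame's increment."""
    # Fraction of the current that survives until the next pass of the imaging laser.
    residual = math.exp(-frame_period_s / tau_off_s)
    # Geometric series 1 + r + r^2 + ... for repeated, linearly summing increments.
    return 1.0 / (1.0 - residual)

frame_period_s = 1 / 30                                  # 30 Hz imaging
fast = steady_state_buildup(frame_period_s, 0.003)       # hypothetical fast opsin
slow = steady_state_buildup(frame_period_s, 0.050)       # hypothetical slow opsin
print(fast, slow)
```

Under these assumptions the fast opsin shows essentially no accumulation, whereas the slow opsin settles at roughly twice the single-frame increment. Doubling the interval between visits (for example by interleaving fields of view) lowers the residual fraction and therefore the buildup.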

Due to the many variables affecting crosstalk currents, including imaging speed, power, wavelength, opsin kinetics, and more, it is important to test the actual currents induced in your typical imaging preparation [11]. Empirical measurements with your opsin of interest and in the typical imaging conditions are essential to understand the level of crosstalk that you will experience. Whole-cell recordings, even in ex vivo slices, under analogous imaging conditions will give the resolution needed to observe how large the imaging-induced photocurrents are.

#### 6. Characterizing In Vivo Spike Fidelity.

Ultimately, the end goal of selecting an opsin is to cause neurons to fire action potentials. While many opsin/stimulation combinations can drive cell spiking [11, 34, 52], it is important to quantify the fidelity of this spiking. One should quantify the fraction of cells spikeable, the fidelity of response (i.e., the fraction of pulses in which the cell spikes), the reproducibility of a response (i.e., the fraction of pulses resulting in one and only one spike, also known as no "doublets"), and the jitter of the resulting action potentials (to understand the limits of your timing control).

Furthermore, these evaluations should be performed at a variety of frequencies, and in conditions as similar as possible to your biological experiments. Under different stimulation patterns and strengths, opsins may be more or less faithful (Fig. 7g), such as strong, slow opsins producing doublets at high stimulation powers. We recommend cell-attached or whole-cell recordings from the anesthetized animal. Of these in vivo criteria, being able to reliably evoke one and only one action potential per pulse is one of the most challenging and useful features. While this analysis has not been performed on all opsins, both ChroME [11] and ChrimsonR [52] can be driven in the regime where one and only one spike is possible.
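The per-cell metrics described above can be sketched as follows. The function name, the 10 ms response window, and the choice of the standard deviation of first-spike latency as the jitter measure are assumptions made for illustration; the window in particular should be adapted to the pulse duration and stimulation frequency used.

```python
from statistics import stdev

def spike_fidelity(pulse_times_s, spike_times_s, window_s=0.010):
    """Summarize how faithfully one cell follows a train of light pulses."""
    counts, latencies = [], []
    for pulse in pulse_times_s:
        # Latencies of all spikes falling in the response window after this pulse.
        in_window = [s - pulse for s in spike_times_s if 0 <= s - pulse < window_s]
        counts.append(len(in_window))
        if in_window:
            latencies.append(min(in_window))   # first-spike latency
    n = len(counts)
    return {
        "fidelity": sum(c >= 1 for c in counts) / n,               # pulses with a spike
        "single_spike_fraction": sum(c == 1 for c in counts) / n,  # exactly one spike
        "jitter_s": stdev(latencies) if len(latencies) > 1 else float("nan"),
    }

# Hypothetical 10 Hz train: one clean response, one doublet, one failure, one clean response.
pulses = [0.0, 0.1, 0.2, 0.3]
spikes = [0.003, 0.104, 0.106, 0.303]
print(spike_fidelity(pulses, spikes))
```

Repeating this over several stimulation frequencies and powers, and then over cells, gives the fraction of spikeable cells and shows where doublets or failures begin to appear.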

#### 7. Considerations for Multiphoton Suppression.

Multiphoton suppression involves using an inhibitory or suppressive opsin (one that hyperpolarizes a cell) instead of an activating one to prevent a cell from firing. Multiphoton suppression shares many of the same concerns as multiphoton activation. Requirements concerning spectrum, photocurrents, imaging compatibility, and toxicity are all very similar between activating and suppressing opsins.

The primary difference is in the requirements for kinetics: whereas activation requires a fast off kinetic to prevent doublets, suppressing opsins have no such constraint. In fact, slower off kinetics can be helpful, as they allow discontinuous (i.e., duty-cycled) light to achieve continuous inhibition. Similarly, as it is rare to know the precise timing of a naturally occurring spike, there is a diminished need for a fast on kinetic, but this comes at the cost of needing to suppress activity for a long duration. With increased photostimulation durations come increased hazards of heat buildup and corruption from optical artifacts. Furthermore, opsins that desensitize to two-photon stimulation, i.e., have a high peak current but a lower sustained current, such as iC++ [11], are disadvantaged as the peak current contributes less to the overall suppression of activity.

Finally, whereas photoactivation is a nearly binary process (a spike is either evoked or it is not), suppression is graded. Even if a given opsin stimulation can suppress spontaneous activity, a particularly large endogenous stimulus might overwhelm this inhibition. As such, many aspects of benchmarking inhibitory opsins are harder. Nonetheless, GtACR2 [58], GtACR1, and Arch [11] have all been used to successfully suppress activity in vivo. Of these, however, Arch and GtACR2 only suppressed ~50% of spontaneous activity [11, 58], whereas GtACR1 was more effective [11].

#### 3 Conclusion

When selecting an activating opsin for very precise manipulation of spike trains, there are a variety of factors that need to be weighed. Thus, a holistic approach that evaluates all qualities of the opsin is required to pick the optimal tool. Of the opsins that have been tested for their compatibility with two-photon optogenetics, only ChroME, ChRmine, CoChR, and ReaChR reach the photocurrent benchmark of 1 nA. Of these, only ChroME and ChRmine can be used to drive spikes with sub-millisecond jitter, and only ChroME responded reliably at rates above 40 Hz. However, as more opsins are developed for multiphoton use, this set of usable opsins will continue to grow. In the near future, a new generation of opsins, discovered or mutated from existing opsins, will advance our ability to control the firing of cells. As the field grows out of its infancy, it is likely that more criteria will present themselves as essential for selecting the best opsin. Until they do, we hope this guide will aid in the benchmarking of future opsins for the precise recreation of neuronal activity patterns.

#### Notes


introduce variability, especially dependent on differences in the viral preparation [75]. Especially if new constructs are being created, the time to develop new viral delivery systems may be prohibitive. For this reason, we recommend using transfected CHO cells for opsin biophysics such as kinetics, and in utero electroporated cortical neurons for experiments assessing neuronal responses. For photocurrent assessment, while CHO cells will provide some information, we recommend using neurons. Subcellular targeting and the interactions between endogenous channels and opsins could potentially alter the effectiveness of different opsin constructs. For this reason, we recommend whole-cell voltage-clamp recordings in neurons in utero electroporated with the opsin construct of interest.


presents a potential confound that if not accounted for could change the results of a biological experiment.

2.4.6: Stimulation laser-induced artifacts in imaging. Just as the imaging laser can somewhat activate the opsin, so too can the 1030 nm stimulation laser drive GCaMP fluorescence, albeit sub-optimally. This creates an artifact whereby pixels recorded during stimulation will be contaminated and will appear brighter. In severe cases (such as when stimulating many targets), this artifact can be much brighter than the GCaMP fluorescence. Some groups have opted to exclude frames or pixels containing the artifact [14, 18], but this can be prohibitive when stimulating at high rates or for long durations, as is required during optogenetic suppression. Instead, we recommend syncing the firing of the stimulation laser to the phase of the fast resonant mirror. By only allowing the stimulation laser to fire during the edge or flyback of the imaging field of view, we ensure that very few neurons are obscured by the stimulation artifact. This can be achieved via a fast analog or digital circuit that controls a sufficiently fast electro-optic modulator controlling the stimulation laser's power. As the rate of the resonant mirror is ~8 kHz, this modulated cycle will be approximately 16 kHz, much faster than any known opsin's on or off rate. This very fast duty cycle ends up being a significant advantage: while this "gating" of the laser might exclude up to 50% of the total energy hitting the sample, we see photocurrents that are only reduced by 15–20%.
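To give a feel for the numbers in this note, the sketch below estimates how much of each scan cycle the stimulation laser may fire when it is gated to the edges of the sweep. It assumes an ideal sinusoidal mirror trajectory, which the note does not state explicitly; the function names and the `clear_fraction` parameter (the central fraction of the full sweep amplitude kept free of artifact) are hypothetical.

```python
import math

def gate_on_fraction(clear_fraction):
    """Fraction of time an ideal sinusoidal scan spends outside the central clear region."""
    if not 0.0 <= clear_fraction <= 1.0:
        raise ValueError("clear_fraction must lie between 0 and 1")
    # For x(t) = sin(phase), |x| <= c holds for a fraction (2 / pi) * asin(c) of the period.
    return 1.0 - (2.0 / math.pi) * math.asin(clear_fraction)

def gate_rate_hz(resonant_hz):
    """The mirror turns around twice per period, so the gate opens twice per cycle."""
    return 2.0 * resonant_hz

print(gate_rate_hz(8000.0))                       # gating rate for an ~8 kHz mirror
print(gate_on_fraction(math.sin(math.pi / 4)))    # central ~71% of the sweep kept clear
```

Under this assumption, keeping the central ~71% of the sweep artifact-free still leaves the laser on for half of the time, because a sinusoidal mirror dwells disproportionately near its turnaround points.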

#### References


A. https://doi.org/10.1073/pnas.1017210108


Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

## An All-Optical Physiology Pipeline Toward Highly Specific and Artifact-Free Circuit Mapping

## Hendrik Backhaus, Nicolas Ruffini, Anna Wierczeiko, and Albrecht Stroh

#### Abstract

All-optical physiology of neuronal microcircuits requires the integration of optogenetic perturbation and optical imaging, efficient opsin and indicator co-expression, and tailored illumination schemes. It furthermore demands concepts for system integration and a dedicated analysis pipeline for calcium transients in an event-related manner. Here, firstly, we put forward a framework for the specific requirements for technical system integration, particularly focusing on temporal precision. Secondly, we devise a step-by-step guide for the image analysis in the context of an all-optical physiology experiment. Starting with the raw image, we present concepts for artifact avoidance, the extraction of fluorescence intensity traces on a single-neuron basis, the identification and binarization of putatively action-potential-related calcium transients, and finally ensemble activity analysis.

Key words All-optical physiology, Functional calcium imaging, System integration, Optogenetics, Event-related analysis of functional fluorescence traces, Stimulation artifact avoidance, Spectral independency

#### 1 Introduction

The advent of optogenetics, pioneered more than a decade ago [1, 2], has revolutionized our understanding of the contribution of individual, genetically defined neurons to circuit function. For the first time, we could start to unravel the causal relation of a distinct genetically defined compartment of the network to whole-brain circuit dynamics and ultimately behavior. Optogenetics impinged upon all fields of neuroscience. For each of those fields, individual roadblocks needed to be overcome based on the principle of optogenetics, i.e., the illumination of a defined brain volume with rather high light intensity exceeding 1 mW mm<sup>−2</sup>. In the field of single-cell electrophysiology, the problem of the Becquerel effect had to be solved by the design of specific electrodes; in the field of behavior, the problem of flexible and non-tethered light delivery had to be solved. These obstacles could be removed already in the first years after the advent of optogenetics [3]. Yet, there is one field of neuroscience which perhaps poses the ultimate challenge for the integration of optogenetics: optical functional imaging. The field of optical imaging, particularly of individual neuron function in the intact tissue, relies on detecting subtle changes in light intensity [4]. The implementation of two-photon microscopy in combination with calcium indicators in the early 2000s allowed for the first time a functional readout of neuronal circuits comprising several hundred neurons in the rodent cortex [4–6]. This led to a tremendous advance in our understanding of how complex circuit dysfunction arises, particularly also in the early stages of neurological disorders [7–11]. The ability to identify a rather small fraction of dysregulated neurons in the circuitry makes the true difference in comparison to single-cell or population readouts. And of course, also here, the implementation of optogenetics holds tremendous potential, as a causal manipulation of, e.g., single dysregulated neurons and the simultaneous readout of circuit function would truly advance our understanding of cause and effect in network disorders. And indeed, in 2012 [12, 13] the efficient two-photon excitation of opsins became a reality, followed by combined two-photon imaging and two-photon optogenetics [14–17]. In these early proof-of-concept studies, it already became apparent that all-optical approaches pose unique challenges in terms of system integration, experimental design, and, not least, analysis. First, conventional line-scanning excitation schemes have proven to be rather inefficient. Second, only a few opsins seem to be two-photon excitable, and there is as yet no molecular or structural predictor which could guide molecular engineering of opsins to modify their two-photon cross section. Third, as mentioned above, the light intensity used for two-photon optogenetics is orders of magnitude higher than the light intensities used for the excitation of fluorophores, creating problems in terms of indicator bleaching and tissue heating. Lastly, the analysis also poses problems in terms of artifact removal and synchronization. All of this has so far prevented the broad implementation of all-optical approaches, despite their promise. Here we will focus on the entire workflow for an all-optical experiment in circuit neuroscience, reporting on recent advances and giving guidance for the unique requirements of all-optical physiology.

Eirini Papagiakoumou (ed.), All-Optical Methods to Study Neuronal Function, Neuromethods, vol. 191, https://doi.org/10.1007/978-1-0716-2764-8\_5, © The Author(s) 2023

#### 2 Experimental Framework

Circuit neuroimaging historically originated in invertebrate species and, as the technique matured, was extended to various model organisms such as zebrafish, allowing, e.g., for whole-body imaging of transparent zebrafish larvae [18]. In recent years, optical imaging of local circuits has been predominantly applied in mouse models. This is due on the one hand to the availability of mouse models of human disorders, and on the other hand to the rather preserved cortical cytoarchitecture in comparison to humans. The limited penetration depth and, of course, the invasiveness of the method currently do not seem to allow its implementation in human preclinical research. Here, we focus on the implementation of all-optical approaches in mouse cortex. Cortical networks are critically involved in fundamental tasks of the brain, from sensory processing [19–21] to decision making [22–24] and mechanisms involved in consciousness [25, 26]. Optical neuroimaging using genetically encoded calcium indicators [27] can resolve the activity of local neuronal populations, even with single-action-potential (AP) resolution in sparsely firing neurons. Recent advances [28, 29] even allow for the detection of cortex-wide activity of thousands of neurons in real time. Despite these advances, genetically encoded calcium indicators are inherently slow, so single APs can only be resolved if the inter-AP interval is sufficiently long, preventing the resolution of individual spikes, e.g., of fast-spiking interneurons such as PV interneurons [27]. What is more, the frame rate of current commercially available two-photon microscopes is around 30 Hz for full-field imaging, at least for the majority of microscopes equipped with resonant scanners [30]. Using random-access scanning, the temporal resolution can be increased, but at the expense of a limited ability to perform post hoc operations for movement correction.
While inertia-free AOM systems allow for a much higher frame rate [31] (see also Chap. 3), the current standard in the field is single-plane full-field imaging at 30 Hz, a good trade-off between speed, signal-to-noise ratio (SNR), and field of view size. This has important implications for the integration of optogenetics in an optical microcircuit imaging framework. With an effective temporal resolution of 30 Hz, the window of coincidence that can be resolved equals 33.3 ms. It is therefore imperative to design a framework which avoids the loss of even a single imaging frame. Here, with these limitations in mind, we put forward concepts focusing on artifact avoidance or removal, to retain the utmost temporal resolution. Even more so, to relate a per se unspecific signal, i.e., the increase of fluorescence intensity, to an underlying AP, specific criteria need to be met in terms of the temporal dynamics of an AP-related event. The entire workflow of an optical functional imaging approach has to be tailored, from the design of the hardware to the experiment itself and the subsequent analysis. This chapter does not strive to give an overview of all possible solutions for the integration of optogenetics in all-optical experiments, but rather focuses on a few tried-and-tested pipelines.

#### 3 Technical Framework for Functional Neuronal Circuit Mapping

An optimal system integration needs to be tailored toward the specific requirements for addressing the neurophysiological research question. Investigating action-potential-related neurophysiological events mandates a precise temporal synchronization of all acquired signals and interventions, in our case opsin excitation, to ultimately gain evidence of cause and effect in the investigated neural circuit.

Multichannel data acquisition (DAQ) interfaces can be employed to collect signals from all involved subsystems. We strongly suggest designing a hierarchical configuration in which all subsystems, including the microscope, report the critical signals, such as the frame trigger, to this unifying and synchronizing device. Commonly used manufacturers are National Instruments (Austin, Texas, USA) and Cambridge Electronic Design (Cambridge, UK) [32–34]. Integrating multichannel data acquisition hardware in the microscope itself is also a viable solution offered by many major microscope companies. Connecting this device with suitable integration software generates a master file for each experiment, containing individual traces for all relevant signals, such as breathing rate, temperature, frame trigger, and certainly optogenetic or sensory stimulation pulses. Referring to the mentioned manufacturers, LabVIEW is used for National Instruments interfaces, whereas DAQ boards from Cambridge Electronic Design use the software package Spike2. Particular care has to be taken to gather all information relevant to the current experiment in this master file. For example, the frame trigger signal of a microscope indicates at which absolute time point each image was acquired. For tactile stimulation, a TTL pulse can be generated by the DAQ board itself, avoiding the variable time delays that occur with PC-based USB control. Thereby, the experimenter gains full control and a complete report of all relevant system parameters without temporal jitter.
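Once frame triggers and stimulation pulses share one clock in the master file, relating each stimulus to the imaging frame acquired at that moment becomes a simple lookup. The following is a minimal sketch of that step; the function name and the example timestamps are hypothetical, and it assumes the frame-trigger times have already been extracted from the master file as an ascending list of seconds.

```python
from bisect import bisect_right

def frame_index_for_events(frame_trigger_s, event_times_s):
    """Map each event time to the index of the frame being acquired at that time."""
    indices = []
    for t in event_times_s:
        # Index of the last frame trigger that occurred at or before the event.
        k = bisect_right(frame_trigger_s, t) - 1
        # None marks events that precede the first recorded frame.
        indices.append(k if k >= 0 else None)
    return indices

# Hypothetical timestamps in seconds, all taken from the same DAQ clock.
frame_trigger_s = [0.0, 0.1, 0.2, 0.3]
stim_times_s = [0.05, 0.25, -0.1, 0.3]
print(frame_index_for_events(frame_trigger_s, stim_times_s))
```

Events after the final trigger are assigned to the last frame in this sketch, so in practice one would also compare them against the end of the acquisition before using the result.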

#### 4 Somatic Calcium Influx as Correlate of Suprathreshold Neuronal Activity

#### 4.1 Two-photon Raster Scanning

Functional calcium imaging utilizes the action-potential-related calcium influx into a cell as a correlate of neuronal activity [35–37]. A genetically encoded calcium indicator (GECI) undergoes conformational changes upon binding of calcium ions, causing a change in its fluorescence properties [38]. Currently, the most commonly employed GECI is GCaMP6. It has to be mentioned that there are different subtypes of GCaMP6: GCaMP6s (the "s" stands for "slow") exhibits a high SNR but slow off kinetics, GCaMP6m ("intermediate") exhibits medium kinetics, and GCaMP6f ("fast") exhibits fast kinetics [27] (see Note 1). We recommend using GCaMP6f whenever possible, particularly for event-related analyses with the aim of extracting action-potential-related calcium transients with utmost temporal resolution. The excitation spectrum of GCaMP6f has its maximum at 497 nm [27] under a one-photon excitation regime. However, the penetration depth using the principle of one-photon excitation is limited, inter alia because of light scattering in the brain, which requires a pinhole to avoid detection of scattered light that does not derive from the focal point [39, 40]. Furthermore, the energy deposited in the brain by photons of this wavelength can lead to phototoxicity and bleaching of the indicator [41]. The development of microscopes using the principle of two-photon excitation combined with raster scanning approaches [42] overcame these limitations and therefore revolutionized functional neural circuit imaging [5]. Furthermore, pinhole solutions became dispensable since all emitted photons necessarily originate from the fluorophores excited at the focal point.

Two-photon fluorescence excitation requires two photons to be absorbed by the fluorophore within a time window of less than 10<sup>−16</sup> s [40], resulting in a tremendously low probability of a two-photon excitation event. Even though two-photon excitation can be achieved by continuous-wave lasers [43, 44], the development of mode-locked femtosecond pulsed lasers, like the Titanium-Sapphire (TiSa) laser (Fig. 1a), enabled focusing light into high-intensity spots and thus efficient two-photon excitation at the focal point at moderate laser power in vivo [42]. Two-photon excited fluorescence images can be acquired by raster scanning the focused laser beam (Fig. 1a), where the beam is deflected by a galvo-resonant mirror system onto the specimen to cover the field of view (FOV) [5]. The dimensions of the FOV depend on the chosen objective and its numerical aperture (NA).

Fig. 1 Principles of functional calcium imaging. (a) Scheme of two-photon-based readout using a femtosecond laser and raster scanning principle. The laser is focused by the objective and the focal point is guided by a galvo-resonant scanhead over the field of view. Emitted fluorescence is collected and detected by a PMT. (b) One-photon Miniature Microscopes used for freely moving animals require a miniaturization of light sources, achieved by bandpass filtered LEDs. A full-field illumination via a GRIN lens is carried out and a CCD-Chip collects emitted fluorescence

The duration that the laser beam remains focused on a pixel is defined as the pixel dwell time. For fluorescence excitation, a pixel dwell time of 1 μs is commonly applied, resulting in frame rates of 30 Hz for galvo-resonant systems [9]. Note that this temporal resolution is critical for the subsequent identification of AP-related calcium transients based on their temporal dynamics.

The emitted fluorescence light is collected by the objective and is deflected to a photo-multiplier tube (PMT). In front of the PMT, a spectral bandpass filter ensures that only light in the bandwidth of the fluorophore's emission band is detected.

#### 4.2 One-photon Miniature Microscope Full-Field Imaging via GRIN Lens

In the field of in vivo calcium imaging, miniature microscopes, mounted on a baseplate implanted on the head of the animal, are a recently emerging method. In this chapter, we refer solely to one-photon miniature microscopes, whereas Chap. 7 gives a detailed introduction to two-photon miniature microscopes. Recently, the first three-photon miniature microscope was proposed by the group of Jason Kerr [45]. Miniature microscopes paved the way for the combination of functional imaging with behavioral assays in which the animal moves freely. What is more, brain regions not accessible by two-photon microscopy due to its limited penetration depth can be targeted. Miniature microscopes were first developed by the group of Mark Schnitzer [46]; nowadays, several manufacturers offer ready-to-use systems (INSCOPIX, Palo Alto, CA, USA [47]), and open-source systems are provided by the community, e.g., the UCLA Miniscope [48] or the FinchScope [49].

Retaining the image information in deeper brain regions as well is made possible by gradient-index optics (GRIN lens, Fig. 1b). GRIN lenses have a cylindrical shape with a radially decreasing refractive index and planar surfaces for optimized optical interfacing. These two factors, the gradient of the refractive index and the planar surfaces, retain, at least to some extent, the image information throughout the passage of light through the GRIN lens. It is important to note that image quality deteriorates with increasing length of the GRIN lens. As a light source, an LED in combination with spectral bandpass filters is used [47], and the image is typically recorded by a CCD chip. The sampling rates for functional calcium imaging differ depending on the chosen model, ranging from 15 Hz with a field of view of 1440 by 1080 pixels (nVoke, Inscopix, Palo Alto, California) up to 60 Hz at a field of view of 752 by 480 pixels (Miniscope-v4, UCLA, Los Angeles, California). The following issue needs to be considered: the reliable recording of neuronal layers depends on the location of the GRIN lens's tip. Neurons located 100–300 μm below the tip can be resolved by electronically adjusting a focusing lens [47]. However, changing the x-y position of the recorded field of view is not possible, as it is predefined by the implantation site of the GRIN lens.

#### 5 The Principles of Optogenetic Manipulation Methods in a Nutshell

There is vibrant research in the field of molecular biophysics on the development of new opsin classes, paralleled by the improvement of existing opsins by molecular engineering. The researcher needs to decide which opsin is best suited for the question at hand, i.e., whether a specific component of the network should be silenced or stimulated. Importantly, at this time, the suitability of a given opsin for two-photon excitation cannot be predicted a priori. Here, we cannot give a comprehensive guide to the requirements of any specific opsin. We advise non-experts in the field of optogenetics to be cautious with newly developed opsins: Achieving strong and stable expression of an opsin requires dedicated and careful titration steps [33, 50, 51]. Also, it has happened more than once that a new opsin, while promising at first, showed cytotoxic effects, requiring the development of new versions [52]. Consequently, if there are no critical requirements such as kinetics or specific ion conductance, we strongly suggest using tried-and-tested opsins, even if they might not be tailored to the specific neuronal class. For example, if the probed neuronal subpopulation exhibits an intrinsic firing rate exceeding the optimal kinetics of a given opsin, it might still be advantageous to choose this opsin. Remember, the opening time of an opsin is not determined by the duration of the excitation pulse, but by the opsin's intrinsic dynamics, described by the parameter τoff, i.e., the time at which the probability of an open state is reduced to 1/e [50]. While this can mean that the intrinsic firing frequency cannot be mimicked optogenetically, it might increase the overall probability of a successful all-optical experimental approach. Critically evaluate which experimental aims are mandatory and which parameters of the experimental design can be adapted. This is of particular importance for an all-optical experiment, which requires the optimal expression of indicator and opsin in addition to all the other technical requirements stated above and below. Spending too much time on individual components of this chain might reduce the chance of completing the overall aims of the given study in the inherently limited time frame. It is already quite complex to set up an all-optical experimental framework, and we suggest limiting the complexity whenever possible.

#### 5.1 One-photon Raster Scan Opsin Excitation

For one-photon optogenetic interrogation, typically a solid-state laser at the optimal excitation wavelength for the respective opsin should be employed [50, 53]. Opsins successfully used for the manipulation of neurons in a one-photon regime include the veteran ChR2 (134) for AP initiation [2, 54] and ArchT for inhibition [2, 51]. For these tried-and-tested opsins, two solid-state lasers can be employed, with 488 nm for ChR2 and 552 nm for ArchT,

Fig. 2 Schematic overview of system combinations for opsin excitation. (a) Two-photon functional calcium imaging can be combined with one-photon or two-photon raster scan methods to target opsin-expressing neurons. The method of Computer Generated Holography is a scanless approach and enables the experimenter to excite several neurons at the same time under a two-photon regime using spatial light modulators (SLMs). (b) Miniature microscopes equipped with a second LED for opsin excitation are based on one-photon full-field illumination delivered via an implanted GRIN lens

provided that the light intensity at the excitation site is at least 1 mW mm<sup>−2</sup> [55]. The one-photon light source is coupled to the microscope's beam path via optic fibers. The region of interest is raster-scanned, typically by a galvo-galvo scanning head (Fig. 2a). To avoid the detection of the laser pulse for opsin excitation, a notch filter blocks the excitation wavelength from entering the PMTs. Note that autofluorescence may increase drastically during the one-photon excitation pulse within the wavelength band of the signal of interest, here of GCaMP, leading to a stimulation artifact. Alternatively, if necessary, the PMTs can be blanked by a shutter during opsin stimulation, at the cost of losing signal detection during stimulation. This might pose problems for subsequent trace analysis, as detecting the stereotypical onset dynamics of an AP-related calcium transient is vital for ensuring specificity. Apart from focusing on a single neuron, this approach can also be used to illuminate a section of, or the whole, field of view to excite a larger subset of opsin-expressing neurons. However, the downside of this approach is that not only the layer recorded in the field of view gets excited, but also neurons located above or below the recorded layer, as long as they are exposed to a light intensity that exceeds 1 mW mm<sup>−2</sup>, as depicted in Fig. 3a.

Fig. 3 Mechanisms for opsin excitation. (a) One-photon excitation by a focused beam is carried out by raster scanning the target neurons. Neurons located above or below the imaging plane are exposed to the excitation beam cone as well. (b) One-photon full-field paradigms are used in miniature microscopes and do not allow for neuron-specific excitation but illuminate the entire FOV at once, eventually exciting neurons below the recorded layers. (c) Two-photon raster scanning approaches overcome this disadvantage due to the physical principle of two-photon excitation, reaching the necessary photon density only in the focal spot. (d) The scanless paradigm also enables experimenters to simultaneously excite several target neurons under a two-photon regime

#### 5.2 Two-photon Raster Scan Opsin Excitation

Utilizing the same physical mechanism as in functional two-photon calcium imaging, two photons of roughly double the wavelength of the optimal one-photon excitation can evoke an activation, i.e., a conformational change, of an opsin. It has to be noted that the suitability of a given opsin for two-photon excitation regimes cannot be assessed a priori (see Note 2). One of the first opsins probed for two-photon excitation was C1V1, typically excited at around 1080 nm [12, 13]. Yet, excitation beyond the standard range of Ytterbium lasers, between 1030 and 1100 nm [56], is possible. We explored wavelengths of up to 1250 nm using an Optical Parametric Oscillator (OPO) as light source, which allowed us to decrease the spectral cross-talk [34]. There is a wide range of possible light sources for two-photon optogenetic raster scan-based excitation: A tunable Ti:Sa laser is well suited for wavelengths up to 1080 nm. However, the femtosecond pulse dynamics of a two-photon imaging laser are not as critical for opsin activation as they are for imaging, since it is not necessary to obtain the shortest pixel dwell time possible. Therefore, fixed-wavelength Ytterbium lasers with lower pulse repetition rates became the most widely applied light source [57]. Since opsin activation is based on conformational changes of the protein initiated by an isomerization from all-trans to 13-cis retinal upon photon absorption, the physical principle differs from functional calcium imaging with fluorophores, i.e., GECIs. Therefore, the pixel dwell time is, besides the chosen objective's point-spread function (PSF) and the laser intensity, a crucial factor. Studies suggest a pixel dwell time of at least 3.2 μs for efficient excitation [13, 33, 58].

The raster scan method enables the experimenter to define several target neurons that are scanned successively in the same plane (Figs. 2a and 3c). Depending on the chosen line spacing, this can lead to rather long excitation times of up to 100 ms. Please note that there is an ideal line spacing value to cover the entire neuronal membrane. The minimal excitation volume of a microscope depends on the theoretically reachable optical resolution, defined by the full width at half maximum (FWHM) of the PSF, and can be estimated from the numerical aperture of the objective and the chosen wavelength. The reachable optical resolution can be calculated with [59]:

$$\Delta r = \frac{\lambda}{2 \sqrt{2} \cdot \mathrm{NA}}.$$

With a given wavelength of λ = 1100 nm and NA = 1.0, an average neuron surface of 20 × 20 μm, and a pixel dwell time of 6 μs, it takes 16 ms to cover the entire neuron. For spiral excitation patterns, a similar, yet slightly shorter stimulation time of 12.5 ms per neuron is needed. If the experimenter aims to excite 6 neurons successively, the excitation process itself takes 96 ms, even without considering the time required to move from target to target. These technical roadblocks limit the effective use of a raster scanning-based approach, at least if the entire membrane of the neuron needs to be excited for AP-inducing depolarization or AP-suppressing hyperpolarization, respectively. The development of opsins with either a higher ion conductance or more efficient two-photon excitability might reduce the membrane area that needs to be excited, and consequently decrease the stimulation time.
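The arithmetic behind these numbers can be retraced with a short sketch. It assumes that the neuron is covered once by square pixels whose side equals the optical resolution Δr; the function names are illustrative and not taken from any cited software.

```python
import math

def lateral_resolution_um(wavelength_nm: float, na: float) -> float:
    """Delta r = lambda / (2 * sqrt(2) * NA), returned in micrometres."""
    return wavelength_nm / (2.0 * math.sqrt(2.0) * na) / 1000.0

def raster_time_ms(side_um: float, delta_r_um: float, dwell_us: float) -> float:
    """Time to visit every resolution-sized pixel of a square target once.

    Assumption: pixel pitch equals delta_r; fractional pixel counts are kept.
    """
    n_pixels = (side_um / delta_r_um) ** 2
    return n_pixels * dwell_us / 1000.0

# Values quoted in the text: 1100 nm, NA 1.0, 20 x 20 um neuron, 6 us dwell time
delta_r = lateral_resolution_um(1100.0, 1.0)
per_neuron_ms = raster_time_ms(20.0, delta_r, 6.0)
six_neurons_ms = 6 * round(per_neuron_ms)   # the text multiplies the rounded value

print(f"resolution {delta_r:.3f} um, {per_neuron_ms:.1f} ms per neuron, "
      f"{six_neurons_ms} ms for six neurons")
```

Under these assumptions the per-neuron time comes out just below 16 ms, which rounds to the value stated above; travel time between targets is not included.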

#### 5.3 Two-photon Scanless Opsin Excitation via CGH

As described in Subsection 5.2, if more than two neurons need to be efficiently activated within the time window of a typical imaging frame, a two-photon raster scanning approach is reaching its limits. Parallel excitation methods in principle allow for an infinite number of simultaneously activated regions of interest (ROIs), limited only by the laser power and the spatial resolution afforded by the spatial light modulator used [59]. Parallel methods most commonly use Computer Generated Holography (CGH) for the generation of the excitation pattern [60] (Fig. 2a). In CGH, liquid crystal spatial light modulators (LC-SLMs) are integrated into the beam path to modulate the phase of the electric field of the laser beam, and typically a low-repetition rate, high-energy Ytterbium laser is used as a light source [57]. A precise control of the spatial specificity of the calculated phase hologram is achieved by applying a Fourier transform-based iterative algorithm on previously acquired fluorescence images of the region of interest [61]. A detailed description of the technical aspects of this method can be found in Chaps. 4 and 11.

With this approach, an effective, simultaneous excitation of several neurons is achievable. The illumination targets do not need to be located in the same z plane; neurons above or below the imaging plane can be interrogated as well, provided that the location of all targets is known (Fig. 3d). Depending on the respective microscope and illumination concept, neurons located several hundred μm away from the current z plane can be excited. Yet, upgrading an existing microscope setup with this technique can be cumbersome and expensive, as major changes in software and hardware are needed. However, it might be an interesting option for scientists experienced in two-photon microscopy to upgrade a microscope with this technology. A further interesting option is hybrid solutions, which combine CGH and scanning [12]. These hybrid solutions guide the illumination patterns generated by the SLM onto a galvanometric mirror. These patterns comprise multiple, typically rather small focal points of high light intensity. This cloud of focal points is then scanned by the galvanometric mirror, so that each focal point covers an entire given neuron using either line or spiral scanning approaches. Thereby, a limitation of CGH-only approaches, namely the ever-decreasing light intensity when increasing the number of neuron-sized patterns, is circumvented. For a constant laser power, the number of neurons that can be effectively activated in a two-photon regime, as discussed in the previous section and in [12], is consequently higher with hybrid solutions (see also Chap. 11).

#### 5.4 One-photon Miniature Microscopes with GRIN Lenses

Recent one-photon miniature microscopes are capable of opsin excitation via an implanted GRIN lens. Typically, for optogenetics, a second LED light source equipped with corresponding bandpass filters is used in these systems. The excitation paradigm is based on one-photon excitation (Fig. 2b).

A clear advantage of using GRIN-lens-based solutions is the ability to specifically target deep structures for optogenetic interrogation without undesired activation, e.g., of passing opsin-expressing neurons or neurites dorsal to the target region. The ventral regions below the GRIN lens are illuminated by the light emitted by the lens, in the geometrical shape of a frustum (Fig. 3b). An estimation of the effective penetration depth, i.e., the depth up to which the threshold for opsin excitation is exceeded, can be achieved using the Kubelka-Munk model of light transmission through diffuse scattering media [55, 62, 63]. The light intensity decreases with the distance from the lens surface, estimated by:

$$I(z) = I_0 \, T(z) \, G(z),$$

where I<sub>0</sub> is the light intensity at the tip of the lens, T(z) the transmission factor according to the Kubelka-Munk model for diffuse scattering media, and G(z) is a geometry factor. The transmission factor is defined as:

$$T(z) = \frac{1}{Sz + 1},$$

with S = 11.2 mm<sup>−1</sup>, the damping constant in mice [55]. As light spreads in a conical shape from the tip of the lens, the angle of divergence is calculated by

$$\theta_{\mathrm{div}} = \sin^{-1}\left(\frac{\mathrm{NA}}{n}\right),$$

where NA is the numerical aperture of the GRIN lens and n = 1.36 is the refractive index of gray matter in the brain [64]. Using the definition

$$\rho = r \sqrt{\left(\frac{n}{\mathrm{NA}}\right)^2 - 1},$$

the geometric component, i.e., the conical light spread, can be calculated as:

$$G(z) = \frac{\rho^2}{\left(z + \rho\right)^2}.$$

Therefore, the intensity depending on the depth is defined as:

$$I(z) = \frac{I_0 \, \rho^2}{\left(Sz + 1\right)\left(z + \rho\right)^2}.$$

For a GRIN lens with a diameter of 200 μm, a numerical aperture (NA) of 0.5, a light intensity of 35 mW mm<sup>−2</sup> at the tip of the lens, and an excitation wavelength of 630 nm, the effective penetration depth for opsin excitation, assuming a minimal threshold for excitation of 1 mW mm<sup>−2</sup>, is 650 μm with a geometric loss of 0.08, resulting in a total volume of 0.118 mm<sup>3</sup> with the Kubelka-Munk model. Note that the lateral spread close to the tip of the fiber may be underestimated by the Kubelka-Munk model [54, 62], at least for highly scattering shorter wavelengths.
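The model above translates directly into a few functions. The following is a minimal sketch under two assumptions that are not stated explicitly in the text: r is taken to be the radius of the GRIN lens (100 μm for the 200 μm lens), and the illuminated volume is taken to be a frustum whose half-angle is the divergence angle. Because the damping constant is wavelength dependent, the sketch evaluates the geometry at the quoted depth of 650 μm rather than re-deriving that depth.

```python
import math

S_PER_MM = 11.2   # damping constant quoted in the text
N_TISSUE = 1.36   # refractive index of gray matter quoted in the text

def rho_mm(radius_mm, na, n=N_TISSUE):
    """rho = r * sqrt((n / NA)^2 - 1); r is assumed to be the lens radius."""
    return radius_mm * math.sqrt((n / na) ** 2 - 1.0)

def transmission(z_mm, s=S_PER_MM):
    """Kubelka-Munk transmission factor T(z) = 1 / (S z + 1)."""
    return 1.0 / (s * z_mm + 1.0)

def geometry(z_mm, rho):
    """Geometric factor G(z) = rho^2 / (z + rho)^2."""
    return rho ** 2 / (z_mm + rho) ** 2

def intensity(z_mm, i0, rho, s=S_PER_MM):
    """I(z) = I0 * T(z) * G(z), in the units of i0."""
    return i0 * transmission(z_mm, s) * geometry(z_mm, rho)

def frustum_volume_mm3(radius_mm, depth_mm, na, n=N_TISSUE):
    """Volume of the light cone section below the lens (assumed frustum)."""
    theta = math.asin(na / n)
    r_far = radius_mm + depth_mm * math.tan(theta)
    return math.pi * depth_mm / 3.0 * (radius_mm ** 2 + radius_mm * r_far + r_far ** 2)

# Lens from the worked example: diameter 200 um, NA 0.5, depth 650 um
rho = rho_mm(0.1, 0.5)
geometric_loss = geometry(0.65, rho)
volume = frustum_volume_mm3(0.1, 0.65, 0.5)
print(f"G(650 um) = {geometric_loss:.3f}, frustum volume = {volume:.3f} mm^3")
```

With the radius interpretation of r, the geometric factor at 650 μm and the frustum volume agree with the geometric loss and the total volume quoted above; `intensity` can be evaluated with a damping constant appropriate for the wavelength in use.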

#### 6 Everything You Always Wanted to Know About All-Optical Data Processing But Were Afraid to Ask

Independently of the chosen technical implementation, the goal of any pipeline for data processing is to extract a subset of features from the raw data, either based on a priori hypotheses, or by data-driven unsupervised methods. Here, we present a pipeline based on the idea of an event-related analysis: We put forward the notion that any functional circuit imaging of a neuronal ensemble ultimately only serves as a tool to extract the underlying action potential train of each individual neuron. Consequently, we provide a step-by-step guide to extract putatively AP-related calcium transients from the raw images. Once this optically derived matrix of AP-related events is generated, e.g., by transforming an extracted intensity trace to a binarized train of zeros and ones, already well-established correlation or interaction concepts can be employed on these dimensionality-reduced datasets.

All-optical experiments require a tailored pipeline for each technical implementation and mandate the avoidance of false-positive identification of putatively AP-related calcium transients.

#### 6.1 Roadmap for Processing All-Optical Data

#### 6.1.1 System Integration

Prior to image data acquisition, the synchronicity of all necessary subsystems needs to be guaranteed (Fig. 4a). This can be achieved by copying all relevant trigger and stimulation pulses to a multichannel data-acquisition interface. These signals include the frame trigger of the microscope, the logic level controlling the light source for opsin excitation, and biomonitoring signals such as the breathing rate and body temperature. Upon digitization, the signals are centrally displayed by a control software, and a master file for each experiment is generated. Reading out the master files allows for an exact post hoc assignment of each raw image to a given time and to the status of, e.g., the stimulation regime.
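The post hoc assignment of images to time points and stimulation status can be illustrated with a small sketch. It assumes that the master file has already been read into two equally sampled lists, one for the frame trigger and one for the logic level of the stimulation light source; the 2.5 V threshold, the sampling rate, and all names are hypothetical and do not refer to LabVIEW or Spike2 file formats.

```python
def rising_edges(samples, threshold=2.5):
    """Indices where a TTL trace crosses the threshold from low to high."""
    return [i for i in range(1, len(samples))
            if samples[i - 1] < threshold <= samples[i]]

def annotate_frames(frame_ttl, stim_ttl, sample_rate_hz, threshold=2.5):
    """Return one (time_s, stim_on) tuple per acquired frame.

    The frame time is the time of the rising edge of the frame trigger;
    stim_on is the state of the stimulation logic level at that sample.
    """
    frames = []
    for idx in rising_edges(frame_ttl, threshold):
        frames.append((idx / sample_rate_hz, stim_ttl[idx] >= threshold))
    return frames

# Synthetic recording at 1 kHz: one frame trigger every 10 samples,
# stimulation light on during samples 25 to 44
fs = 1000.0
frame_ttl = [5.0 if i % 10 == 0 and i > 0 else 0.0 for i in range(60)]
stim_ttl = [5.0 if 25 <= i < 45 else 0.0 for i in range(60)]

frames = annotate_frames(frame_ttl, stim_ttl, fs)
for time_s, stim_on in frames:
    print(f"frame at {time_s * 1000:.0f} ms, stimulation {'on' if stim_on else 'off'}")
```

Because both traces share the clock of the acquisition interface, the assignment does not depend on software timestamps of the individual subsystems.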

#### 6.1.2 Image Data Acquisition

The initial step in analyzing the data is to reduce the impact of movement artifacts (Fig. 4b). We differentiate between two types of artifacts: displacements induced by movement in the x-y plane, and in the z plane. While changes in the x-y plane can be corrected retrospectively by algorithms based on a hidden Markov model [65] or on discrete Fourier transformation-based image alignment [66], provided that the field of view is sufficiently large, movements in the z plane are more complicated to correct. Note that the possibility for x-y movement correction represents the main reason why random-access scanning is not recommended, and the field is

Fig. 4 Workflow of a pipeline for an event-related analysis of all-optical data. Raw data is segmented into ROIs and intensity traces are extracted. A binarization of transients is carried out and a temporal and spatial classification of the recorded microcircuit is applied

currently mainly conducting single-plane resonant imaging. Simply put, if the soma of a given neuron is displaced out of the current x-y imaging plane, this information cannot be regained. A potential solution for this problem is to record a 3D volume. However, this inevitably results in a reduction of temporal resolution for any given neuron recorded. When working with the currently most used resonant systems with a frame rate of 30 Hz, we strongly discourage recording 3D volumes, as the temporal resolution is critical for the subsequent identification of putatively AP-related calcium transients. This is a prime example of how analysis needs should inform the preceding technical implementation steps: Make every effort to minimize movement by ensuring mechanical stability; this cannot be stressed enough (see Note 3).
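One common form of Fourier-based x-y alignment is phase correlation, sketched below with NumPy. This is an illustration of the general idea and not necessarily the algorithm of [66]; it assumes rigid, whole-pixel, circular shifts, whereas real data call for sub-pixel estimation and handling of image borders.

```python
import numpy as np

def estimate_shift(reference, frame):
    """Integer (dy, dx) that aligns `frame` to `reference` when passed to np.roll."""
    cross_power = np.fft.fft2(reference) * np.conj(np.fft.fft2(frame))
    cross_power /= np.abs(cross_power) + 1e-12      # keep the phase only
    correlation = np.fft.ifft2(cross_power).real
    peak = np.unravel_index(np.argmax(correlation), correlation.shape)
    # Peak positions beyond half the image size correspond to negative shifts
    return tuple(int(p) if p <= s // 2 else int(p) - s
                 for p, s in zip(peak, correlation.shape))

def register(reference, frame):
    """Undo the estimated displacement by a circular shift of the frame."""
    dy, dx = estimate_shift(reference, frame)
    return np.roll(frame, (dy, dx), axis=(0, 1))

# Synthetic test image displaced by 3 pixels in y and -5 pixels in x
rng = np.random.default_rng(0)
reference = rng.random((64, 64))
moved = np.roll(reference, (3, -5), axis=(0, 1))

shift = estimate_shift(reference, moved)
aligned = register(reference, moved)
print("correction shift (dy, dx):", shift)
```

The returned shift is the correction to apply, i.e., the negative of the displacement. Displacements along z cannot be recovered this way, in line with the caveat above.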

#### 6.1.3 Segmentation

The raw images, typically comprising 512 × 512 pixels, contain several biological compartments: neuronal somata, axons, dendrites, blood vessels, and certainly a multitude of non-neuronal cells, such as astrocytes. While one critical advantage of microscopy using fluorescent indicators is the reduction of image complexity, i.e., ideally only the features of interest express the fluorophore, still only a fraction of the image contains the signal of interest. What is more, in the context of neuronal microcircuit imaging, it is advantageous to integrate several pixels that reflect the same functional compartment: when the experimenter aims to identify the suprathreshold activity of individual neurons in the microcircuit, each neuron can be treated as a functional unit. Consequently, each neuronal soma is defined and segmented as a ROI (Fig. 4c). In recent years, the range of applications that support the experimenter in ROI segmentation has grown rapidly: automatic segmentation based on deep learning algorithms drastically shortens the time-consuming step of manually identifying neurons in the recorded images [67]. While some deep learning-based methods process the average of all images, other techniques integrate the temporal information of neuronal activity by using subsets of all images for the segmentation of active ROIs [67, 68]. The approach proposed by Soltanian-Zadeh et al. is based on a 3D technique: subsets of the acquired images are created and used to predict 2D probability maps for active neurons utilizing a neural network. Upon applying a threshold to exclude low-probability regions from each probability map, individual somata are extracted from high-probability regions [67]. Ultimately, the somata positions from all images of the sequence are combined to obtain a final output of active somata areas.
However, manual segmentation of functional calcium imaging data by marking neuronal somata with polygon shapes still prevails because it can be applied to datasets of varying quality, e.g., datasets with a particularly low signal-to-noise ratio, for which deep learning algorithms reach their limits. Depending on the strength of GCaMP6 expression and the overall SNR, if somatic changes in calcium concentration are targeted, a decontamination of the ROIs containing the neuronal soma from the neuropil signal might be useful. The decontamination is carried out by expanding a given ROI in cardinal and diagonal directions beyond the region of the soma; the expanded area is then separated into several neuropil regions surrounding the initial ROI. The size of each neuropil subregion should equal the size of the initial ROI. Under the assumption that the intensity trace of a given ROI is generated by a mixture of different underlying signal components, one of which is the somatic signal of interest, non-negative matrix factorization or independent component analysis is employed to perform a blind source separation into the underlying signal components. By weighting the presence of each component in the central ROI, under the assumption that the somatic signal is the prevailing component contributing to the ROI's intensity trace, the strongest signal component is defined as the somatic signal [69].

#### 6.1.4 Trace Extraction

After the locations of the somata constituting the ROIs have been identified within the image sequence, the signal over time of each ROI is extracted (Fig. 4d). The intensity values of all pixels in a given ROI are averaged for every image of the temporal sequence, resulting in an intensity trace for every ROI, with a temporal resolution determined by the frame rate. It has to be noted that the absolute level of intensity depends on multiple factors such as autofluorescence, the expression level of the GECI, or stray light entering the objective. Therefore, it is recommended to calculate the relative change of fluorescence, as these changes, depending on their temporal dynamics, are most likely due to changes in intracellular calcium levels and can therefore be termed calcium transients. Note that there is inevitably a drift of the baseline level due to bleaching. These drifts can be compensated for, as long as linear operations are used (see Note 4). There is no convention on how to perform baseline correction. An option used by us and others is to identify a sufficiently long period of quiescence, i.e., a stable baseline not interrupted by any transients, and to define this period as baseline (F<sub>0</sub>), separately for each neuron [7, 9]. The relative change of fluorescence is then calculated by relating the intensity at each time point (F) to the baseline fluorescence [70]:

$$\frac{\Delta F}{F_0} = \frac{F - F_0}{F_0}.$$
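Trace extraction and baseline normalization can be sketched in a few lines. The sketch assumes that each image is a nested list of pixel intensities, that a ROI is a list of (row, column) coordinates, and that the quiescent baseline period has been chosen by the experimenter; all names are illustrative.

```python
from statistics import mean

def roi_trace(frames, roi_pixels):
    """Mean intensity over the ROI pixels for every frame of the sequence."""
    return [mean(frame[y][x] for y, x in roi_pixels) for frame in frames]

def relative_change(trace, baseline_slice):
    """(F - F0) / F0, with F0 the mean intensity during a quiescent period."""
    f0 = mean(trace[baseline_slice])
    return [(f - f0) / f0 for f in trace]

# Three synthetic 2 x 2 frames; the ROI covers the upper row
frames = [
    [[100, 100], [0, 0]],
    [[100, 100], [0, 0]],
    [[150, 150], [0, 0]],
]
roi = [(0, 0), (0, 1)]

trace = roi_trace(frames, roi)
dff = relative_change(trace, slice(0, 2))   # first two frames as baseline
print(trace, dff)
```

Defining the baseline separately for each ROI, as in the text, simply means calling `relative_change` with a ROI-specific slice.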

#### 6.1.5 Artifact Removal

Prior to binarizing the extracted intensity traces, potential photostimulation artifacts superimposed on the signal need to be identified and, if possible, corrected (Fig. 4e). Characteristic properties of photostimulation artifacts can serve as a basis for reliable identification and subsequent correction. Looking only at individual responses to an optogenetic stimulus might lead to the notion of a physiological signal. Yet perhaps the most decisive difference between an artifact and a physiological response is the inherent variability of the physiological signal. Artifacts, with rare exceptions, are rather consistent. Overlaying and averaging the individual responses therefore gives important cues on the probability of a physiological origin. For that, the temporal section of a trace following each photostimulation is arranged in an n × m matrix M, with n = stimulus interval and m = total frames/n. Figure 5b shows an overlay of the intensity traces of all photostimulation periods of Fig. 5a (gray lines).

Photostimulation artifacts exhibit several typical features: Firstly, a photostimulation artifact will show both a sharp onset and a sharp offset. Functional calcium transients of physiological


Fig. 5 Evaluation of photostimulation artifacts. (a) Artifacts caused by photostimulation do not represent a physiological signal. Onsets of photostimulation are indicated by black triangles. (b) By overlaying (gray) and averaging (red) periods of photostimulation, an estimation on the impact of the artifact can be made. Here, 100 sample points prior to and 200 sample points after the photostimulation onset, indicated by the black triangle, are considered. The averaged signal waveform can be used in an algorithm to minimize the photostimulation artifact. (c) Subtracting the averaged waveform depicted in (b) at each stimulation onset can reduce the intensity of the photostimulation artifact. Note that signal components containing the physiological signal of interest can be altered by the algorithm as well and, in the worst-case scenario, will be completely eliminated

origin might also display a rather sharp onset, due to the high affinity of the indicator for calcium, reflected by a small dissociation constant KD, and the steep temporal gradient of calcium influx. But they will be characterized by a slow decay, an inevitable consequence of the high affinity of the indicator for calcium. The off kinetics do not mirror the true time course of the decrease of somatic calcium concentration. If a researcher were interested in assessing the re-uptake of calcium, an indicator with a high KD, i.e., low affinity, would be advantageous, but these indicators typically display a lower SNR [71]. Any trace deflection with a sharp decay when using a low-KD indicator is therefore almost certainly not associated with an AP-related somatic calcium response. Secondly, if the duration of the intensity deflection equals the duration of light administration, a physiological origin is highly unlikely. As depicted in Fig. 5b, the intensity traces during photostimulation exhibit both of the mentioned features.

What is more, although this very much depends on the imaging setup used, the latency between the onset of the stimulation pulse and the putative response is critical: If the latency is below 3 ms, the deflection cannot be considered a physiological response, due to the time needed for AP initiation and the influx of calcium into the cytosol. Please note that this criterion can only be used if the sampling rate of the GCaMP emission channel is high enough to resolve durations of less than 3 ms. Lastly, a physiological response might be subject to adaptation, and often does not occur upon every stimulus, e.g., due to changes in local inhibition [54]. Consequently, while a signal deflection that occurs upon each stimulus might well be the result of a physiological response, a response that drastically changes its amplitude, or is sometimes absent, is likely to be of physiological origin. To test for that, it might be useful to modulate the light intensity of the optogenetic stimulus: if a decrease in stimulation intensity leads to the sudden disappearance of a transient, an AP-related origin is again likely, as a technical artifact would simply scale with the light intensity, even though non-linear relationships can also occur. Certainly, a control experiment comprising cells expressing only the indicator should be conducted in any case.

The subsequent correction of photostimulation artifacts in the intensity trace is a delicate task that needs to be conducted carefully. A possible approach is to employ non-negative matrix factorization [72] to identify the background noise of a given ROI containing the stimulation artifact. In a second step, the identified noise component is subtracted from the ROI's signal [73]. Here, for illustration purposes, we obtained the averaged intensity trace of all time periods of photostimulation (Fig. 5b) and subtracted it from the raw intensity trace (Fig. 5a). This results in an intensity trace devoid of at least the majority of the non-physiological signal sources (Fig. 5c). Please note that the processed intensity trace has to be treated with caution: we can still observe fluctuations in the signal that could be misinterpreted as functional calcium transients. Furthermore, methods based on complex mathematical models often act as a black box, making it difficult for non-experts to grasp the underlying procedures. We strongly recommend designing the experimental paradigm so as to avoid any stimulation artifact during the measurement and to minimize the need for post-processing of raw data as much as possible. The characteristic temporal dynamics and latencies of the GECI used, as put forward before, can serve as a sanity check, and a subsequent event-based identification of AP-related calcium transients is mandatory.
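The illustrative correction described above, averaging all photostimulation periods and subtracting that average from the raw trace, might be sketched as follows. This is not the code used for Fig. 5: the function name is hypothetical, and referencing the template to the mean of all samples outside stimulation is an assumption added so that the baseline fluorescence itself is not removed.

```python
def subtract_stim_template(trace, stim_onsets, stim_len):
    """Subtract the stimulus-averaged artifact from every stimulation period.

    trace       : list of fluorescence samples of one ROI
    stim_onsets : sample indices at which each light pulse starts
    stim_len    : pulse duration in samples (onset + stim_len must fit the trace)
    """
    in_stim = [False] * len(trace)
    for onset in stim_onsets:
        for k in range(stim_len):
            in_stim[onset + k] = True

    # Assumption: baseline = mean of all samples outside photostimulation.
    rest = [v for v, s in zip(trace, in_stim) if not s]
    baseline = sum(rest) / len(rest)

    # Average over all stimulation periods, expressed relative to baseline.
    template = [
        sum(trace[onset + k] for onset in stim_onsets) / len(stim_onsets) - baseline
        for k in range(stim_len)
    ]

    corrected = list(trace)
    for onset in stim_onsets:
        for k in range(stim_len):
            corrected[onset + k] -= template[k]
    return corrected, template


# Synthetic example: flat baseline of 1.0 with two identical artifacts.
raw = [1.0] * 20
for idx in (3, 4, 10, 11):
    raw[idx] += 5.0
corrected, template = subtract_stim_template(raw, stim_onsets=[3, 10], stim_len=2)
```

Because the template is an average, any physiological response that is time-locked to the stimulus is partly removed as well, which is one more reason to treat the corrected trace with caution.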

6.1.6 Event-Related Binarization

Action potential-related calcium transients exhibit typical signal dynamics and waveforms. Methodologically, the rather stereotypical dynamics of the somatic calcium influx and re-uptake or efflux are convolved with the binding dynamics of the fluorescent calcium indicator [74]. Therefore, each binarization approach has to be tailored to the calcium indicator used. The resulting AP-related calcium transient can be described based on parameters such as rise time, decay, and duration, or, alternatively, an idealized transient is used for deconvolution approaches [75–77]. A rather straightforward yet effective method with a rather low false-positive rate [7, 9, 34] is a threshold-based algorithm. First, the standard deviation of the noise band of the trace is determined, ideally in a section of the trace not containing the signal of interest. In the case of an all-optical experiment, time periods outside of stimulation events and with low spontaneous activity should be used. A detection threshold is then derived from this standard deviation. Once the signal exceeds this threshold, and if additional criteria such as minimal duration are met, a putative event is detected. Furthermore, as the decay of an AP-related calcium transient is typically roughly exponential, at least for commonly used indicators such as GCaMP6 and OGB-1, a regression of the trace with an exponential function can be conducted. The goodness of fit of this regression gives additional evidence for the physiological nature of the given transient. By identifying onset, peak, and offset for each detected transient, we can represent the fluorescence signal as a binarized event array (see Notes 5 and 6). Note that deconvolution-based methods also provide such a binarized array. All following analyses are carried out on these reduced datasets (Fig. 4f).
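A threshold-based binarization of this kind can be sketched in a few lines. The sketch below is an illustration under stated assumptions, not the published implementation: the threshold is taken as the noise mean plus `n_sd` standard deviations, and both `n_sd = 3` and the minimal duration of three frames are placeholder values that would have to be tuned to the indicator used. The exponential fit of the decay is not included.

```python
import statistics


def binarize_events(trace, noise_slice, n_sd=3.0, min_len=3):
    """Threshold-based event detection on a single fluorescence trace.

    noise_slice : slice selecting a quiet section (no stimulation, low activity)
    n_sd        : threshold in noise standard deviations (assumed value)
    min_len     : minimal event duration in frames (assumed value)
    Returns the binarized array and a list of (onset, peak, offset) indices.
    """
    noise = trace[noise_slice]
    threshold = statistics.fmean(noise) + n_sd * statistics.pstdev(noise)

    binary = [0] * len(trace)
    events = []
    start = None
    # A sentinel below any threshold closes an event that runs to the end.
    for i, value in enumerate(list(trace) + [float("-inf")]):
        if value > threshold and start is None:
            start = i
        elif value <= threshold and start is not None:
            if i - start >= min_len:
                peak = max(range(start, i), key=lambda k: trace[k])
                events.append((start, peak, i - 1))
                for k in range(start, i):
                    binary[k] = 1
            start = None
    return binary, events


# Quiet section, one transient of four frames, one single-frame spike.
trace = [0.0, 0.1, -0.1, 0.05, -0.05, 0.0, 0.1, -0.1,
         2.0, 1.5, 1.0, 0.6, 0.0, 0.05, 1.0, 0.0]
binary, events = binarize_events(trace, noise_slice=slice(0, 8))
```

In this synthetic trace the four-frame transient is kept, whereas the single-frame deflection at index 14 is rejected by the minimal-duration criterion.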

6.1.7 Connectivity Analysis

The coherent activity of neurons of a given microcircuit or ensemble in the field of view is calculated and visualized based on two variables for each neuron. First, the window of coincidence, in which two events are considered to occur simultaneously, has to be defined (Fig. 4g).

Certainly, for obtaining information on Granger causality [78], this coincidence window should be as short as possible, in the range of only a few milliseconds. Yet, in the field of imaging, the minimal coincidence window cannot fall below the frame period (the inverse of the imaging frame rate); for resonant scanning approaches this is 33.3 ms, and it might even be necessary to lengthen the coincidence window further, particularly in scan-based imaging methods. Otherwise, coincidence would be biased by the individual location of a given neuron in the scan field. Within this coincidence interval, at least one other neuron needs to exhibit an event. Second, the average number of simultaneously active neurons is determined. The binarized calcium transients of each neuron i are used to identify the number of active neurons per frame by introducing a counter ci, registering the coincident active time points, and an array Ai, storing the number of coincident active neurons for each time point. Now, we ask for each active time point j whether the number of active neurons is nact,i > 1. If this is the case, the counter ci is incremented by 1 and nact,i − 1 is appended to Ai. Afterward, ci is divided by the total number of active time points Xi, yielding the relative number of coincident active time points of each neuron i, and the mean of Ai gives insight into the average number of coincident active neurons.

The number of simultaneously active time points relative to the number of all active frames per neuron, ci/Xi, is plotted against the overall number of active frames of the respective neuron, Xi. Additionally, a linear regression analysis is applied to the data.

The average number of simultaneously active neurons can be visualized using box plots. The distributions of the average number of coherently active neurons are compared by a statistical test, e.g., Student's t-test or the Wilcoxon rank sum test.
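The counting scheme with the counter ci, the array Ai, and the number of active frames Xi can be sketched as follows. Two points are assumptions of this sketch rather than statements of the original text: the coincidence window is taken to be exactly one frame, and Ai stores the number of active neurons minus one, i.e., the partners of neuron i excluding the neuron itself. The function and key names are hypothetical.

```python
def coincidence_stats(binary):
    """Per-neuron coincidence measures from binarized event arrays.

    binary : list of per-neuron lists (neurons x frames) containing 0 or 1
    Assumption: the coincidence window equals one imaging frame.
    """
    n_frames = len(binary[0])
    # Number of active neurons in every frame.
    n_act = [sum(row[j] for row in binary) for j in range(n_frames)]

    stats = []
    for row in binary:
        active_frames = [j for j in range(n_frames) if row[j]]
        # A_i: coincident partners (excluding neuron i) per coincident frame.
        A = [n_act[j] - 1 for j in active_frames if n_act[j] > 1]
        c = len(A)              # counter c_i of coincident active frames
        X = len(active_frames)  # all active frames X_i of this neuron
        stats.append({
            "X": X,
            "rel_coincident": c / X if X else 0.0,
            "mean_partners": sum(A) / c if c else 0.0,
        })
    return stats


# Three neurons, four frames; only frame 0 is shared by all three.
example = [[1, 1, 0, 0],
           [1, 0, 0, 1],
           [1, 0, 0, 0]]
result = coincidence_stats(example)
```

The `rel_coincident` values correspond to ci/Xi and can be plotted against `X`; lengthening the coincidence window beyond one frame would require dilating the binarized arrays before counting.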

To gain insights into the connectivity pattern of the given microcircuit, a network graph can be constructed from the position of each ROI, its activity, and the correlation between each pair of ROIs. Each ROI is represented by one vertex. The position of each vertex is given by the ROI's x and y position. The color is based on the ROI's activity. In the example given in Fig. 4, the color scale is constructed as a non-linear gradient between green, yellow, and red in order to show single ROIs as hypoactive (green), normal (yellow), or hyperactive (red). The pairwise Pearson's correlation between each pair of ROIs is represented by the width of the edges.
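The ingredients of such a network graph can be assembled without any plotting library. The sketch below is illustrative only: the data layout (dictionaries for vertices, a mapping from ROI pairs to correlation values) and the use of the number of active frames as the activity measure are assumptions, and constant traces, for which Pearson's correlation is undefined, are simply assigned a correlation of zero.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson's correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # assumption: undefined correlation is treated as no edge
    return sxy / math.sqrt(sxx * syy)


def build_graph(positions, binary):
    """Vertices carry position and activity; edges carry the correlation."""
    vertices = [
        {"id": i, "x": x, "y": y, "activity": sum(binary[i])}
        for i, (x, y) in enumerate(positions)
    ]
    edges = {
        (i, j): pearson(binary[i], binary[j])
        for i, j in combinations(range(len(binary)), 2)
    }
    return vertices, edges


positions = [(0, 0), (10, 0), (0, 10)]
binary = [[1, 0, 1, 0],
          [1, 0, 1, 0],
          [0, 1, 0, 1]]
vertices, edges = build_graph(positions, binary)
```

For display, the edge value would be mapped to line width and the activity to the green-yellow-red color scale; how negative correlations are drawn is a design choice that the text does not specify.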

Together, such a connectivity analysis takes advantage of the spatial information obtained by an imaging experiment, which at least partially compensates for the inherent, drastic limitations in temporal resolution compared to methods such as single-cell patch-clamp recordings. Only the combination of both methods, the high temporal resolution of electrophysiological methods, which directly acquire the signal of interest, and functional imaging, which captures the local functional architecture, can provide a holistic understanding of neuronal network (dys)function.

#### 6.2 Toward Cross-Talk-Free Experimental Designs

6.2.1 Assessing the Impact of Continuous Illumination for Calcium Imaging on Opsin Excitation

While the strong light intensities needed for an effective excitation of an opsin pose the aforementioned problems for simultaneous functional calcium imaging, we also need to consider the converse: the impact of the constant excitation of the calcium indicator in terms of unwanted activation of opsins (see also Chap. 2). The activation of an opsin requires a distinct quantal energy; in the case of ChR2-based opsins, or the cis-trans isomerization of retinal, this threshold ranges from 0.1 to 1 mW mm<sup>−2</sup>. All-optical one-photon experiments, even using the same wavelength for imaging and opsin activation, are therefore possible, as long as the excitation intensity remains below this threshold [32, 47, 54]. However, for two-photon line scanning conditions, the effective light intensity per pixel can well exceed this barrier. Fortunately, there is also a temporal barrier: it seems that pixel dwell times below 3.2 μs may not suffice for efficient opsin excitation, at least for opsins such as C1V1 [12]. Pixel dwell times of regular resonant scanning range well below that number. However, novel generations of opsins designed for more efficient two-photon excitation might be excited even at shorter pixel dwell times.

6.2.2 Increasing the Spectral Separation Between Opsin and Indicator to Minimize Optogenetic Stimulation Artifacts on the Imaging Data

Avoiding cross-talk in the GECI emission channel is an important criterion in the design of all-optical experiments. In both one-photon and two-photon regimes, the excitation light can be rather easily prevented from entering the PMTs, e.g., by notch filters. Nonetheless, the strong light pulse might lead to a broadband increase in autofluorescence, including in the emission band of the fluorescent calcium indicator, leading to a decrease in SNR.

While in principle the utmost spectral separation of opsin and indicator excitation wavelengths is advisable, there might be good reasons to choose an opsin/indicator pair with similar or even identical excitation wavelengths, e.g., if both indicator and opsin are best suited for the scientific question at hand, as mentioned above. Approaches to the post-hoc identification of photostimulation artifacts are manifold and should be applied with utmost care. Common approaches compare the dynamics of the signal's intensity during photostimulation to the expected dynamics of calcium transients [17] to infer the presence of a physiological response (see also Subheading 6.1.5). As a consequence, the specificity of the analysis toward the detection of AP-related calcium transients is reduced. Other approaches sacrifice pixels that are considered to contain signal components that represent an artifact [79, 80], at the cost of a reduction of the field of view. Another option, if a spectral separation of excitation and emission wavelengths cannot be implemented, is to perform gating of light sources, inevitably leading to a loss of data [81]. Yet, it is possible to avoid any stimulation artifact altogether with a design that has to be tailored to each opsin/GECI pair (see Note 7). For the pair GCaMP6/C1V1, artifact-free all-optical physiology experiments are possible if the opsin is excited at wavelengths at or exceeding 1100 nm while GCaMP6 fluorescence is simultaneously imaged at 920 nm [34]. Employing a blue-shifted opsin and a red-shifted calcium indicator represents another viable option [58].

#### 7 Outlook

All-optical physiology in neuroscience, i.e., simultaneously recording and manipulating individual functional components of a given microcircuit, opens up truly new experimental designs [80]. Particularly in research on preclinical rodent models of neurological disorders, these "dream" experiments can provide evidence for the impact of the (dys)function of individual neurons and of neuronal ensemble activity on network performance and behavior. Yet, all-optical physiology has not yet exploited its full potential. The reasons are manifold. First, although tremendous progress has been made, the ideal opsin for efficient induction of action potentials, mimicking the temporal dynamics of physiological inputs, is not yet available. While efficient two-photon excitation seems to have almost achieved this goal, efficient two-photon-based inhibition in particular is still in its infancy [82]. Nevertheless, the opsin and the stimulation method are only a part of the entire workflow. We need to move beyond rather synthetic, non-physiological stimulation patterns, which often evoke non-physiological responses, e.g., due to hypersynchronization of the network caused by the simultaneous activation of multiple neurons. For that, we not only need to record the neuronal activity but also to analyze the activity pattern in real time. Only then will we be able to replay and meaningfully modify the endogenous ensemble activity, closing the loop. In this chapter, we covered key aspects of system integration and the basic concept of the analysis of all-optical functional data. Indeed, in our view, the entire workflow, from the acquisition of raw images to the identification and binarization of events and the design of the stimulation pattern, has to be transformed toward real-time closed-loop performance [83], as put forward in the context of global functional brain state changes [84]. This requires, first of all, dedicated fast hardware. Each component of the system (microscope, optogenetic pattern generator, signal acquisition, and signal computation) has to operate in real time. This is within reach for the microscope, optogenetic pattern generator, and signal acquisition, as put forward in this chapter, but is still far from achievable for event-based signal processing and signal analysis. Achieving real-time capability of the critical analysis step requires close collaboration between neuroscientists, mathematicians, and (bio-)informaticians. First, the neurophysiological event of interest has to be identified and described in painstaking detail. For instance, in our own research on state transitions of local populations, these events needed to be discerned from the large parameter space of neurophysiological events and described in terms of their spatiotemporal dynamics and their variance. Only then can the mathematicians take advantage of either statistical methods or new unsupervised machine learning algorithms. Indeed, the advent of artificial intelligence may make it possible to accelerate at least key aspects of analysis routines, which traditionally take weeks or months, to seconds or even milliseconds [67, 85, 86].

The fast-evolving field of all-optical physiology of neuronal microcircuits can only thrive in a multi-disciplinary environment and is critically dependent on each component to be optimized and ideally integrated. Yet, now is the time, now all individual advances can come together to make real-time closed-loop all-optical physiology a reality.

#### 8 Notes

1. Choice of indicators: Adapt the kinetics of your indicator to the firing properties of the neuronal populations of interest. In rather fast-spiking neurons, indicators with slow off kinetics such as GCaMP6s are not advisable, but in deeper structures with low SNR, slow indicators might be the better choice.


#### References


neural activity with cellular resolution in awake, mobile mice. Neuron 56:43–57


Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

## Spatial and Temporal Considerations of Optogenetic Tools in an All-Optical Single-Beam Experiment

### Damaris Holder and Matthias Prigge

#### Abstract

All-optical experiments promise neuroscientists an unprecedented possibility to manipulate and measure neuronal circuits with single-cell resolution. They rely on highly fine-tuned microscopes with complex optical designs. Of similar importance are genetically encoded optical actuators and indicators that also have to be optimized for such experiments. A particular challenge in these experiments is the detection of natural firing patterns via genetically encoded indicators while avoiding optical cross-activation of neurons that are photon-sensitized to allow optical replay of these patterns. Most optogenetic tools are sensitive in a broad spectral range within the visible spectrum, which impedes artifact-free read-and-write access to neuronal circuits. Nonetheless, carefully matching biophysical properties of actuators and indicators can permit unambiguous excitation with a single wavelength in a so-called single-beam all-optical experiment.

In this chapter, we evaluate the current understanding of these biological probes and describe the possibilities and limitations of those tools in the context of the all-optical single-beam experiment. Furthermore, we review new insights into the photophysical properties of actuators, and propose a new strategy for a single-beam two-photon excitation experiment to monitor activity while minimizing cross-activation of the actuators. Finally, we highlight aspects for future developments of these tools.

Key words Channelrhodopsin, Two-photon excitation, Photobiophysics, Optogenetics, Crossactivation

#### 1 Introduction

Over the last few years, the development of advanced optical stimulation technologies in combination with novel optogenetic probes foretold a bright future for the interrogation of brain function in behaving animals. In particular, optically imprinting naturalistic firing patterns onto neuronal circuits while simultaneously obtaining activity readouts at a single-cell level is considered the holy grail of a holistic understanding of neuronal circuits [1–3]. In this chapter, we refer to such an experiment as an all-optical experiment and offer a review on the possibilities and pitfalls of two-photon stimulation of photon-sensitized single neurons while optically monitoring activity in the entire neuronal circuit. To date, such experiments have mostly been performed in specialized laboratories with long-standing expertise in advanced two-photon microscopy or in optogenetic tool development groups. They are, however, more difficult to implement in common neuroscience laboratories. To enable these all-optical experiments, both optical technologies and optogenetic probes need improvement in terms of high signal-to-noise ratio (SNR) imaging and less error-prone readouts. Furthermore, obstacles for most laboratories include the enormous costs of optical systems including two beam lines and optical modulators. Nevertheless, the overall idea for such an approach is straightforward: optical stimulation of single neurons expressing a sensitive blue-absorbing actuator in combination with a highly sensitive red- or IR-light absorbing indicator to report activity of single neurons in a large field of view (FOV) [4–6].

Eirini Papagiakoumou (ed.), All-Optical Methods to Study Neuronal Function, Neuromethods, vol. 191, https://doi.org/10.1007/978-1-0716-2764-8\_6, © The Author(s) 2023

The optical design of these kinds of systems has been implemented in several ways, which are discussed within this book (Chaps. 1, 2, 3, 4, and 5) or reviewed elsewhere [7]. Briefly, kilohertz resonant mirrors are commonly implemented for fast imaging, resulting in line scan durations between 50 and 100 μs. Alternatively, acousto-optic deflectors (AODs) are an order of magnitude faster and can yield line scan durations of 10 μs and lower. With electrically tunable or ultrasound lenses and piezo- and acousto-optic elements, scanning speeds can be drastically increased, and 3D imaging and random-access scanning can be realized [8–10]. Furthermore, liquid-crystal-based spatial light modulators (SLMs) are used to extend imaging into the third dimension via computer-generated holography (CGH) [11].

CGH with liquid crystal SLMs is most often used to optically stimulate dozens of target cells in a three-dimensional brain volume on a single-cell level [5, 12–14]. Two strategies for single-cell optical stimulation have emerged in the last decade. The first strategy involves spiraling of the two-photon laser beam on the somata of neurons expressing an actuator. An increasing number of actuators are then activated along the spiral beam path on the surface of the membrane, and their single-channel photocurrents are integrated toward the depolarization of the entire cell, eventually leading to spiking of the neuron [15]. By combining SLM and galvanometric mirrors, such a spiral approach can easily be multiplexed [2]. Yet, latency and spike jitter are rather ill-defined and can vary around 20 ms and from 5 to 20 ms, respectively, due to biological factors such as cell morphology, cell excitability, or expression level. Additionally, the temporal spike accuracy also depends on the exact technical implementation in terms of spiraling speed, determination of light powers, and axial resolution for the excitation of the actuator. Furthermore, how efficiently the photocurrent of a single channel translates into neuronal depolarization depends on the off-kinetics of the actuator in use: slow-closing actuators integrate photocurrent over a longer time and therefore drive spiking more reliably in neurons. However, slow-closing actuators also increase the probability of multiple spikes during a single spiral scan.

The second approach for single-cell stimulation is the so-called scanless stimulation approach, which can be implemented using holographic stimulations [16, 17]. Here, an SLM is conjugated to the rear focal plane of the objective, and can generate multiple (circular) patterns to stimulate several entire somata simultaneously. Sub-millisecond jitter can be achieved with holographic stimulation [18, 19], and actuators with fast off-kinetics can be used [20, 21]. Such technologies for the precise control of spike timing in many neurons in parallel will be pivotal to probe computational principles of neuronal circuits or even to artificially drive animal behavior without sensory inputs [1, 22].

In summary, a rapidly increasing number of optical technologies have been developed for neuroscientific applications. However, imaging and optical manipulation still face challenges on both the technological and the tool front, as the channels for imaging and manipulation are spectrally distinct and therefore require two independent optical pathways with heavy, sensitive, and expensive optical equipment.

#### 1.1 Optogenetic Tools for All-Optical Experiments

Several advances are being made in tool development as new and better indicators and actuators are constantly being designed. However, the most commonly used green-absorbing calcium indicators GCaMP6 through 8 are still superior to their red-absorbing counterparts jRGECO or K-GECO1 [23, 24]. It is believed that only the pairing of red indicators and blue-absorbing actuators will completely prevent cross-interference between imaging and stimulation. Yet, for the more commonly used combination of a blue-light absorbing indicator with a red-absorbing opsin, the actuator is always cross-activated during imaging, since retinal isomerization is triggered via the hypsochromic shifted excitation light (shorter wavelength relative to maximum absorption; see Fig. 1a). This can elicit electronic transitions to higher electronic states (e.g., S0 → S2), which are less efficient than those to a lower energy state (S0 → S1) [25, 26]. To some extent, lower imaging light powers can mitigate such effects, but deeper imaging and in vivo applications are often incompatible with low-power imaging.

Despite ongoing developments toward better red-light absorbing indicators, artifacts from photophysical processes induced by blue light imaging excitation in these red absorbing indicators still constitute a major obstacle for their widespread use in all-optical experiments.

Fig. 1 Spectral combinations of indicators and actuators in an all-optical experiment, and their respective cross-activation. (a) illustrates the overlap between the absorption spectra of GCaMP with the red-light absorbing opsin ChRimson. Here, during hypsochromic imaging of the indicator, higher electronic state transitions in the all-trans retinal can evoke photoactivation of ChRimson. While in (b) the absorption spectra of RCaMP and the action spectra of ChR2 are shown. Here, both spectra are sufficiently separated to avoid cross-activation of the blue-light absorbing opsin via bathochromic imaging, while contamination of RCaMP emission during opsin photoactivation is minimal due to low absorption of the red indicator for shorter wavelengths

For example, promising red-light absorbing calcium indicators such as jRGECO1a and RCAMP2, which originate from mApple, exhibit a blue-light sensitive protonation state giving rise to increased red fluorescence emission with similar kinetics as calcium-induced changes [23, 24]. Similarly, far-red calcium indicators, excited at approximately 590 nm, such as R-GECO and K-GECO also exhibit blue-light artifacts, albeit with faster kinetics than calcium-induced changes [27].

In the last decade, the rise of voltage indicators has additionally augmented the sensor landscape. However, the aforementioned bottleneck is only further exacerbated by commonly used genetically encoded voltage indicators with a high SNR, such as ASAP or Marina, that also absorb in the blue/green spectral range [28, 29]. Additionally, a somewhat baffling finding is that promising candidates of opsin-based voltage indicators are not accessible via two-photon excitation due to poorly understood photophysical processes related to the lifespan of the excited state [30]. Moreover, opsin-based indicators have inherited the aforementioned S2 hypsochromic excitation. Thus, although the collection of genetically encoded tools is constantly expanding and improving, an artifact-free and robust combination of indicators and actuators has yet to be established.

#### 2 All-Optical Single-Beam Experiments

A single-beam experiment refers to an approach in which a single laser beam at a given wavelength is multiplexed to simultaneously image and stimulate neuronal circuits. The most straightforward approach to this problem is to target two distinct cell populations with an actuator and indicator. A frequently used approach is to restrict the expression of actuators and indicators to different brain regions (anatomy-defined approach; Fig. 2a). Here, cross-activation of the axonal projections in one target region, in which neurons express an indicator, is reduced, since photon-sensitized axons require higher light power to trigger an action potential than photon-sensitized somata [31]. However, such an approach necessitates identification of the light power regime for artifact-free imaging of indicators and carefully performed control experiments. The light power used to evoke the release of a neurotransmitter from photon-sensitized axons depends on several factors such as myelination, directionality of the axons relative to the plane of light, and axonal caliber [32].

Fig. 2 Fundamental approaches for all-optical experiments. In (a) the expression of actuator and indicator are separated through space. In this anatomical approach, full-field one-photon illumination of axons still triggers the activation of actuators and hinders simultaneous imaging of indicators, but low power imaging of the indicator can reduce cross-activation of the axon drastically. Moreover, if expression of the actuator is restricted to the soma, photo-stimulation can be entirely spatially separated, rendering an artifact-free simultaneous all-optical experiment possible. (b) Illustration of a cell-type-defined approach in which different cell types express either an indicator or actuator, so that imaging and optogenetic stimulation are segmented into small subregions containing only specific cell types. The approach in (c) is a variation of (b): while using a distinct expression of indicators and actuators, here, technology to define arbitrary paths allows for the selection of neurons with indicator expression in the entire field of view. In (d), a typical dual-color all-optical experiment is illustrated. Here, both indicator and actuator are expressed in the same cell. In this case, the choice of indicator/actuator combination is crucial. Usually green-absorbing indicators are employed alongside red actuators to minimize optical crosstalk

A related anatomy-defined strategy is often used for in vivo all-optical experiments. Here, a brain region harboring photon-sensitized somata is stimulated to elicit action potentials that propagate orthodromically to a projection region in which neuronal activity is monitored optically [32]. While such an approach considers the natural latencies and branching of axonal collaterals, it also requires restriction of actuator expression to the soma to avoid unwanted activation of axons. Additionally, for an in vivo application in a single beam experiment, the laser beam would have to be split and the resulting beamlets guided independently to two separate brain regions. Overall, both anatomy-defined strategies are limited to the study of axonal innervation in a specific brain region and are highly dependent on anatomical architecture.

For an all-optical experiment within the same region, actuator and indicator can be expressed in distinct cell populations, e.g., excitatory versus inhibitory neurons. In this case, a small imaging region of interest (ROI) containing few cells (e.g., excitatory neurons) devoid of actuator expression can be selected, while a subset of neurons expressing the actuator (e.g., interneurons) are holographically stimulated outside these ROIs (Fig. 2b).

Similar to the aforementioned anatomy-defined approach, a cell-type defined approach also requires strict soma-targeting of both actuator and indicator. Instead of limiting imaging ROIs to two small subsets, fast scanning approaches using AODs or resonant-galvo-galvo configurations allow an imaging path to be defined in which only neurons expressing indicators are scanned (see Fig. 2c). In particular, AOD technology allows one to traverse such an arbitrary pathway in the kHz range, making it ideal for use with fast voltage indicators as well as for calcium imaging [10, 28].

Importantly, anatomically and genetically defined approaches do not enable both activation and imaging from the same set of neurons. Ultimately, only simultaneous expression of both indicator and actuator can grant full read-and-write access to neurons, but it remains particularly challenging to prevent cross-activation with a single laser beam.

Dual-color experiments are classically considered to be a bulletproof concept for an all-optical experiment. Yet, the compromise on indicator SNR due to less sensitive red-absorbing indicators makes such an approach challenging in terms of detection (Fig. 2d). Even though dual-color experiments usually rely on two different laser beams, they can serve as an interesting framework and introduce valuable parameters to consider in a single-beam experiment.

Fig. 3 Illustrations of scanning parameters and actuator stimulation patterns in an all-optical dual-color configuration experiment. (a) For a given x-y plane, the scanning mirrors move a laser spot across the field of view (FOV) with a given lateral (rxy) and axial (rz) resolution [33, 34]. (b) As an example, we assume a system with a 16x objective with NA = 0.8, yielding a FOV of 700 × 700 μm (dFOV); an 8-kHz resonant scanner, scanning with a resolution of 512 × 512 lines; and a laser with frepetition = 80 MHz, and a single pulse width of Tpulse = 140 fs, where we scan a single neuron with a dimension Dsoma = 30 μm (in teal an idealized neuron, which was approximated to a rectangular scan area (gray shaded area) of 30 × 30 μm for estimating parameters). Scanning the entire FOV takes Tframe with a single line scan of the duration Tline. We define the distance don-cell as the distance scanned by the laser on the cell and doff-cell as the distance the laser scans over non-opsin expressing parts of the FOV. The distance d is defined in y-direction as the distance between two scan lines. Depending on the resolution, the laser spot runs over a neuron expressing an actuator several times per frame. Ton\_cell is the time the laser spends scanning the cell for a given line. Depending on the position of the neuron within the FOV, the subsequent line can give rise to another illumination period, i.e., cell is located on the border of the FOV so that doff-cell is very small on one end of FOV. Within a single given frame, a neuron in our example will be exposed to Nsoma ≈ 22 lines with each of them having Ton\_cell < 3 μs

As a practical example shown in Fig. 3, the imaging beam (longer wavelength than that of actuator absorption) passes over neurons expressing the actuator and indicator with a specific lateral (rxy) and axial (rz) dimension. The spacing between consecutive scan lines (d) depends on the resolution of the acquired image (Rscan), and defines how many times (Nsoma) the laser passes an idealized soma (quadratic dimension given as Dsoma) during a single frame. Based on typical values used in all-optical experiments (shown in Fig. 3), the following relations apply:

$$d = \frac{d_{\text{FOV}}}{R_{\text{scan}}} \approx 1.4\,\mu\text{m} \tag{1}$$

Here, the spacing d between two consecutive lines is 1.4 μm and FOV denotes the field of view with respect to the entire laser scanning range, which is 700 × 700 μm in our example. Considering a 16x objective (0.8 NA) with a scanner speed (νscan) of 8 kHz and a resolution of 512 × 512, a laser scan line (dimension rxy) passes over a 30-μm (rectangular) neuron approximately 22 times:

$$N_{\text{soma}} = \frac{D_{\text{soma}}}{d} = \frac{D_{\text{soma}}\, R_{\text{scan}}}{d_{\text{FOV}}} \approx 22 \tag{2}$$

During these 22 passes of the beam area of rxy over the cell surface, only 20% of the total membrane surface will be illuminated (we assume the idealized neuron to be a square with an area of 900 μm²):

$$A_{\text{illuminated}} = N_{\text{soma}}\, r_{\text{xy}}\, D_{\text{soma}} = 182\,\mu\text{m}^2 \tag{3}$$

The scanning frequency defines the dwell time (Ton\_cell), which denotes the time the laser beam spends scanning a cell during each scan line of the bidirectional movement of the resonant scanner (example):

$$T_{\text{on-cell}} = D_{\text{soma}} \left( 2\, \nu_{\text{scan}}\, d_{\text{FOV}} \right)^{-1} < 3\,\mu\text{s} \tag{4}$$

In our example, the average dwell time per cell and line is roughly 3 μs. However, for a single opsin molecule, dwell times are even shorter (cross-sectional diameter approximately 2 nm). In our example, an opsin that diffuses within the plasma membrane at 0.5 μm/s can be regarded as immobile during the dwell time of 3 μs [35]. Each opsin will be exposed to several light pulses during one scan line, but once activated, an opsin molecule with its millisecond off-kinetics can only be triggered once during a single line and, depending on rxy and the FOV, only once within a single frame acquisition. This is in drastic contrast to indicators, which have fluorescence lifetimes of 2–10 ns and can be excited and emit a single photon with every light pulse (approximately 10 light pulses when sweeping a beam across a single molecule). Therefore, the number of emitted photons indicative of the state an indicator is in can be further increased by prolonging dwell times in the microsecond range, while avoiding increased photocurrent through double activation of an opsin with its millisecond photocycle.
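The estimates in Eqs. 1–4 can be reproduced with a few lines of Python. The sketch below is only an illustration under simplifying assumptions that are not stated in the text: the beam moves at constant speed along each line, the resonant scanner acquires two lines per period, and turnaround or flyback time is ignored. The function name and dictionary keys are hypothetical; the numerical inputs are those of the Fig. 3 example.

```python
# Idealized scan-geometry estimates for the Fig. 3 example.
# Assumptions (not from the text): constant beam speed along a line,
# bidirectional scanning with no turnaround or flyback time.

D_FOV_UM = 700.0        # edge length of the field of view (um)
R_SCAN = 512            # scan lines per frame
D_SOMA_UM = 30.0        # edge length of the idealized square soma (um)
NU_SCAN_HZ = 8_000.0    # resonant scanner frequency (Hz)
F_REP_HZ = 80e6         # laser repetition rate (Hz)
A_ILLUM_UM2 = 182.0     # illuminated area quoted in Eq. 3 (um^2)
V_OPSIN_UM_PER_S = 0.5  # opsin membrane mobility quoted in the text (um/s)


def scan_estimates():
    """Return the quantities of Eqs. 1-4 plus a few derived timings."""
    d_um = D_FOV_UM / R_SCAN                               # Eq. 1: line spacing
    n_soma = D_SOMA_UM / d_um                              # Eq. 2: lines per soma
    t_on_cell_s = D_SOMA_UM / (2 * NU_SCAN_HZ * D_FOV_UM)  # Eq. 4: dwell per line
    t_line_s = 1.0 / (2 * NU_SCAN_HZ)   # two scan lines per scanner period
    t_frame_s = R_SCAN * t_line_s       # time to scan all lines once
    return {
        "d_um": d_um,
        "n_soma": n_soma,
        "t_on_cell_s": t_on_cell_s,
        "t_line_s": t_line_s,
        "t_frame_s": t_frame_s,
        # Share of the 30 x 30 um soma covered by the area quoted in Eq. 3.
        "illuminated_fraction": A_ILLUM_UM2 / D_SOMA_UM**2,
        # Laser pulses emitted while the beam crosses the whole soma once.
        "pulses_on_cell_per_line": F_REP_HZ * t_on_cell_s,
        # Distance an opsin moves during one soma crossing (nm).
        "opsin_drift_nm": V_OPSIN_UM_PER_S * t_on_cell_s * 1e3,
    }


if __name__ == "__main__":
    for name, value in scan_estimates().items():
        print(f"{name}: {value:.4g}")
```

Under these assumptions, the frame time comes out at 32 ms, a little over 200 laser pulses are emitted while the beam crosses the soma on one line, and the opsin moves far less than its own diameter of about 2 nm during that crossing, which is consistent with treating it as immobile.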

To better understand two-photon activation of actuators and indicators, two-photon cross-section data (unit: Goeppert-Mayer; GM) are invaluable for estimating cross-activation. Yet, GM values are rare for opsin-based indicators, though retinal-based actuators generally tend to have higher values than those of fluorescent proteins. As mentioned above, the light sensitivity and dynamic range of genetically encoded indicators are being periodically improved. Current versions allow two-photon imaging at low excitation powers. A recent report demonstrated that GCaMP6 can be imaged using two-photon light powers as low as 50–80 mW (using an 80 MHz Ti:Sapphire laser) and that this did not lead to any significant change in firing rates of neurons expressing a red-shifted opsin (λmax = 540 nm) [22, 36]. Nevertheless, subthreshold depolarizations were not monitored; they were likely induced and may have biased ongoing basal excitability levels, thereby causing measurement artifacts in the neuronal circuit under investigation. Ultimately, further two-photon optimized indicators together with carefully selected opsins, as discussed below, can potentially enable single-beam all-optical experiments with little to no cross-activation. In particular, this includes a well-controlled expression level of opsins that is just sufficient to reliably trigger spikes holographically, while minimizing cross-activation when line-scanned.

#### 3 Temporal Considerations of Actuator Kinetics in Single-Beam All-Optical Experiments

The possibility to optically probe neuronal systems is tightly linked to the discovery and exploitation of light-gated ion channels [37–39]. Channelrhodopsins are the main actuators used in neuroscience. The success of opsin-based actuators triggered an avalanche of protein engineering studies as well as metagenomic exploitation to design opsins for specific experimental settings [40–42]. This new hype around opsin research was accelerated by the groundwork laid by earlier studies on light-driven ion pumps (reviewed in [43]). Based on their established spectroscopic, crystallographic, and electrophysiological paradigms, in less than 10 years, causal relationships between channelrhodopsin structure and ion-conducting pathways could be drawn on an atomic level [41].

The sequence of conformational changes and ion movements is described in a photocycle (see Fig. 4). We now know that an aqueous pore is formed between helices 1, 2, 3, and 7 and is guarded by two main gates. The pore opening is initiated via a so-called central gate, which is in close proximity to the retinal chromophore; it pre-opens the conduction pathway and allows influx of water molecules into the pore without conducting ions or protons. Only upon breaching the inner gate can ions and protons be conducted along their electrochemical gradients.

Fig. 4 Unified photocycle for Channelrhodopsin2 in relation to its electrophysiological parameters [44, 45]. (a) Schematic depiction of the unifying photocycle, in which a single two-photon process can trigger either an anti-photocycle from the dark-adapted (DA) ground state or the transition to the light-adapted (LA) state. The LA state thermally relaxes back to the DA state, or a second two-photon process can trigger a syn-photocycle. After relaxation to the open state(s), O1 or O2, the molecule transitions back to the closed ground state of the respective cycle. (b) Illustration of a two-pulse experiment in which two light pulses are given with increasing delay times (Δt) while monitoring the recovery of peak photocurrents during the second light pulse. The recovery of the peak photocurrent Trec after varying Δt obeys a monoexponential increase referring to the transition of LA to DA (dotted red line)

The open time of ion channel pores, which has been a subject of mutational studies that led to many ChR variants, can range from milliseconds to seconds [42, 46]. After closing, the opsin returns to its ground state for a new photon excitation. Despite our increasingly detailed understanding of these processes, it is surprising how photocycles can differ among various ChR variants.

A unified photocycle derived from spectroscopic data also integrates a multitude of electrophysiological parameters such as photocurrent profile, on- and off-kinetics, inactivation and recovery, as well as light sensitivity (Fig. 4) [44]. Photocycles can exist in various forms, from a single photocycle consisting of one dark (D), an open (O), and a non-conducting state (P), to a dual photocycle consisting of two conducting states up to highly complex and branched photocycles [47–49].

Indispensable for our understanding of branched photocycles has been Raman spectroscopy. This technique can provide a fingerprint of molecules based on their vibrational or rotational states, revealing an additional photo-induced rotation of the retinal chromophore in ChR between two conformational isomers: syn and anti [50]. These conformational isomers refer to the –C15=NH– bond and are distinct from the photoisomerization of the –C13=C14– cis/trans configurational isomers.

Such a second "light-switch" in an actuator has potentially interesting consequences for all-optical experiments in neuroscientific applications. In the completely dark-adapted state (DA), where all rhodopsin molecules harbor an all-trans/-C15=N-anti isomer, absorption can cause the molecule to enter either an anti-photocycle (all-trans/-C15=N-anti → 13-cis/-C15=N-anti) or a syn-photocycle (13-cis/-C15=N-syn → all-trans/-C15=N-syn) ending in the light-adapted ground state (LA) [44, 51]. Molecular dynamics simulations support the notion that the ground state in the syn-cycle indeed has a pre-opened central gate, which can only completely open the ion-conducting pathway through a second absorption process. The LA ground state can thermally convert back to DA in the range of seconds (seen as recovery kinetic τrec), a process that can be monitored with a two-pulse experiment (see Fig. 4b).
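The two-pulse experiment lends itself to a short numerical sketch. The Python example below assumes, as a simplification, that the peak photocurrent of the second pulse is normalized so that it rises from 0 immediately after the first pulse to 1 after complete recovery, following a single exponential with time constant τrec. The function names, the delay values, and the τrec of 5 s are hypothetical and serve only to illustrate how a recovery time constant could be extracted from such data.

```python
import math


def recovered_fraction(delta_t_s, tau_rec_s):
    """Normalized peak recovery after a dark interval delta_t (monoexponential)."""
    return 1.0 - math.exp(-delta_t_s / tau_rec_s)


def estimate_tau_rec(delays_s, fractions):
    """Estimate tau_rec from normalized recovery data.

    Linearizes the model as -ln(1 - fraction) = delta_t / tau_rec and takes
    the least-squares slope of a line through the origin.
    """
    y = [-math.log(1.0 - f) for f in fractions]  # requires every fraction < 1
    slope = sum(t * v for t, v in zip(delays_s, y)) / sum(t * t for t in delays_s)
    return 1.0 / slope


if __name__ == "__main__":
    tau_true_s = 5.0                        # hypothetical recovery time constant
    delays_s = [0.5, 1.0, 2.0, 5.0, 10.0]   # hypothetical inter-pulse delays
    data = [recovered_fraction(t, tau_true_s) for t in delays_s]
    print(f"estimated tau_rec: {estimate_tau_rec(delays_s, data):.3f} s")
```

With measured photocurrents, the normalization step and noise near full recovery would need more care than this noiseless sketch suggests, since the logarithm amplifies errors as the fraction approaches 1.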

Despite the debate on the contribution of this syn-photocycle to the overall photocurrent [52], it is evident that such photocycles exist and can be subjected to further design efforts. Potentially, an opsin could be engineered to exploit the dual photon absorption process to initiate an efficient syn photocycle only after dual-photon processes. Such a dual absorption would strongly depend on Ton\_cell and the likelihood of a second absorption process from LA being triggered to start the syn-photocycle. The probability of such an effect will be very low within the time window of Ton\_cell. Additionally, the probability of a two-photon absorption process within Ton\_cell would quadratically decrease with light power. Engineering efforts would also have to be directed toward reducing the conductance in the anti-photocycle, that is, minimizing photocurrent and enhancing conductance in the syn-photocycle to reliably trigger action potentials during holographic stimulation. Holographic stimulation in the order of a millisecond would efficiently trigger a dual absorption process, and therefore trigger the syn-photocycle. After a single absorption event, the opsin molecule then thermally reverts from LA to DA with τRec.

Yet another interesting photophysical light switch is the so-called photo-induced closing of an opsin. As spectroscopists visualized different photo-intermediates through their different absorption bands, neuroscientists exploited these different states by applying light at the wavelength corresponding to a particular state during the photocycle, thereby short-circuiting photo-intermediates directly back to their ground state. The efficiency of such a photo-induced off-switch is highest when certain photo-intermediates are stabilized and have long dwell times. This has been shown for bacteriorhodopsin, the prototype of a light-driven pump, where single-point mutations can result in the accumulation of specific M and N photo-intermediates, which can be photoconverted to the ground state [53, 54]. Similarly, mutations of the residues C128 or D156 in ChR2 generate a set of mutants in which opsins accumulate in an open state. These so-called step-function opsins can be turned on for hundreds of seconds with a short light pulse (in the millisecond range), thereby initiating the transition from the dark state to the stabilized open state. Because the absorption band of the open states is red-shifted, they can then quickly be turned off with red light illumination above 550 nm [46]. Therefore, a one-photon full-field background illumination with red light could potentially mitigate cross-activation of the opsin during a two-photon all-optical experiment with a single laser imaging beam around 980 nm.

However, such photo-induced back reactions still remain poorly understood, particularly during two-photon absorption. It would be interesting to explore the possibility of exciting deep-blue-absorbing ChRs such as PsChR or CheRiff (λmax ca. 440 nm) with a 980 nm two-photon beam on the bathochromic (longer wavelength excitation relative to maximum absorption) side of their activation spectra. Eventually, cross-activated opsin molecules reaching O would then be transferred back with 980 nm excitation light.

Therefore, the matching of temporal properties of opsin molecules to imaging parameters can be utilized to minimize cross-activation. Here, off-kinetics that only allow a single activation of an opsin molecule during an imaging iteration, in combination with indicators with short fluorescence lifetimes, are favorable. In particular, Ton, Toff, and Trec of opsin molecules have been heavily engineered and range from a few milliseconds to seconds [6, 55]. In contrast, photo-induced back reactions are not well understood, but could potentially act as an optical master switch that can render opsin molecules photocurrent-effective or ineffective.

#### 4 Spatial Consideration of Optogenetic Tools in a Single-Beam All-Optical Experiment

For an all-optical experiment with a single laser line, a high SNR and single-cell resolution are essential. For stimulation of single neurons within a large field of view with hundreds of neurons expressing photon-sensitive opsins, undesired off-target activation via axons (and dendrites) passing close to the soma of the targeted neuron becomes a challenge. To decrease background fluorescence as well as confine excitation to selected neurons, it is advantageous to localize the expression of actuators to specific compartments. Therefore, restricting the expression of opsins to the soma and dendrites has been a robust strategy.

Figure 5 gives an overview of different genetic targeting strategies along the neuronal cell body. Genetically fusing the opsin to the ankyrin-G protein couples opsin expression to the spectrin-actin network and consequently restricts expression to the somatodendritic region and axon hillock. Despite the size of its targeting motif of more than 700 amino acids, the ankyrin-G sequence also targets the dendritic region and hence still gives rise to off-target activation [67]. However, targeting opsins to more defined and small subcellular regions such as the axon initial segment (AIS) has failed so far [56]. Early attempts to deploy targeting motifs found in voltage-gated ion channels, such as NaV1.2, to anchor actuators in the AIS have been successful in terms of localizing the transgene to the AIS, but the number of opsin molecules was too small to optically induce action potentials [57]. A similar strategy utilizes a shorter targeting motif derived from NaV1.6 and localizes opsins sufficiently to the AIS, but also changes the intrinsic excitability within the host cell itself [58, 59]. Based on these insights, short targeting motifs to prevent expression within the axons are most promising. For example, fusing a targeting motif from kainate receptor subunit 2 to the N-terminus of CoChR yielded a soma-oriented actuator (soCoChR) allowing for holographically triggered action potentials with a 1 ms resolution and minimal off-target activation [19]. Similarly, the cytoplasmic C-terminus of the voltage-gated potassium channel Kv2.1 yielded specific targeting of opsins to somata as well as proximal dendrites (stChronos, stCoChR, or stGtaCR2) [20, 21]. Both targeting motifs, NaV1.2 and Kv2.1, have been successfully used for two-photon connectivity mapping; however, a direct comparison between motifs remains to be made.

Fig. 5 Overview of cellular localization of different target sequences. The neuron is divided into four segments: axon initial segment, soma, somatodendritic region, and soma with proximal dendrites. Here, an overview is given of the different molecular targeting strategies that are employed depending on which of these segments is supposed to express the opsin

Clearly, restricting the expression of optical actuators to a fraction of the entire cell membrane leads to smaller photocurrent. Yet, the soma-targeted opsins exhibit larger photocurrents than do wildtype versions, indicating higher membrane expression of opsins when fused to the Kv2.1 motif [20, 60]. Furthermore, apparent off-kinetics of soma-targeted versions in neurons are faster than unmodified versions, since delayed axonal and distal dendritic current contributions are removed from the overall kinetics.

On a similar basis, targeting cytoplasmic indicators to the soma or nucleus can be beneficial. In particular, nucleus-targeted calcium indicators help to segment calcium transients to individual cells [61]. However, onset latencies are prolonged due to slower calcium rise in the nucleus. For single spike events, calcium transients might not propagate into the nucleus and the overall response sensitivity of calcium indicators is reduced in the nucleus.

Figure 5 gives an overview of different targeting sequences and the resulting expression regions. In a recent screening, novel and optimized targeting motifs have been reported [62]: a shorter ankyrin-based motif combined with an ER trafficking signal from Kir2.1 fused to the N-terminus restricts calcium indicators to the somatodendritic region (Fig. 5). In addition, a de novo synthesized coiled-coil peptide that self-assembles into a complex slows down transport out of the soma.

As cellular targeting helps to avoid cross-activation, a foreseeable future breakthrough will be the exact control of the expression level in relation to the membrane/volume ratio. Here, expression systems that encode an auto-feedback that restricts expression to the necessary amount will greatly reduce the effect of cross-activation with indicators [63].

#### Note 1: Beyond Temporal and Spatial Constraints: Ion Permeability

Not only spectral properties within a photocycle can be exploited for neuroscientific applications. Ion selectivity is a key feature to adapt ChR to specific neuroscientific experiments. As ChRs are intrinsically non-selective cation channels permeable to protons, sodium, potassium, and even calcium, they are not designed per se for the sensitive electrochemical ionic gradient in neurons. With an atomic structure of different ChRs at hand [56, 57], several protein engineering attempts yielded ChR variants with improved ion selectivity (Permeability, P). Particularly interesting for a neuroscientific application are PNa+/PH+ ratios. For example, variants such as ChRomeT (A71S/E90A/H114G), Chronos-D139H, or the naturally occurring opsin PsChR exhibit shifted ratios of ten to hundredfold to higher sodium permeability. Further, PCa2+/PH+ ratios have been modified to improve calcium conductance in ChR2 mutants such as ChR2-L132C, ChR2-S63D, or ChR2-L132C-T159C [6, 58, 59]. An increased sodium selective opsin translates photocurrent more directly into membrane depolarization in neurons.

In a branched photocycle, the high-conductance O1 state exhibits lower proton selectivity than does the low-conductance state O2, and therefore the initial peak photocurrent carries a larger fraction of sodium in its photocurrent. So far, attempts to modify ion selectivity of distinct open states remain unsuccessful. The structural changes that lead to different ion selectivities in the respective open state are not well understood. As both open states share the same ion-conducting pore, mutational analysis introducing small structural changes will likely influence the ion selectivity of both open states.

#### Note 2: Beyond Temporal and Spatial Constraints: Spectral Consideration

In an ideal all-optical experiment, actuators and indicators are sufficiently separated without any cross-activation and optical setups are equipped with two independent spectral laser lines. However, in-depth understanding of photophysical processes underlying wavelength-dependent absorption can help design better all-optical single-beam experiments.

So far, the findings that stationary photocurrents saturate at lower light powers and can exhibit slightly shifted absorption spectra are understood only phenomenologically [64, 65]. Typically, for one-photon imaging, peak photocurrents saturate at light powers higher than 15 mW/mm², whereas stationary photocurrents saturate already at less than 5 mW/mm² [55]. However, for optimized opsins with large photocurrents, light intensities can be orders of magnitude lower [40]. Photocurrent profiles at low light powers exhibit a slow increase in amplitude that sometimes completely lacks any peak photocurrent due to lower absorption probability per time. This feature can be utilized in two-photon imaging, where indicators are monitored at low intensities and high scanning frequencies, inducing small and skewed transient currents carried by a mix of O1 and O2 (see Fig. 3). For 2P optogenetic activation of neurons, high light powers will evoke large peak photocurrents and efficiently trigger action potentials. Therefore, for utilizing such a strategy, ChRs with strong and fast inactivation and strong expression are preferred. Similarly, activation of opsins outside their maximal absorption range can also lead to photocurrent profiles with reduced peak photocurrents and slow on-kinetics [66].

In summary, low power imaging and excitation wavelengths outside the maximal absorption band reduce transient peak photocurrents and therefore minimize cross-activation of opsin-based actuators during two-photon resonance imaging.

#### 5 Summary

To gain cellularly resolved read-and-write access to an entire neuronal circuit, photo-sensitizing proteins, indicators (read), and actuators (write) need to be expressed within a single neuron. Ideally, neuronal activity is monitored in a large field of view while multiple neurons are stimulated in parallel with sub-millisecond latencies. Here, scanless stimulation technologies for multicell excitation, such as holographically multiplexed spiraling or sole holographic stimulation with spots fitting the size of a neuron, are used to excite actuators, whereas indicators are imaged using fast scanning approaches (Fig. 6). Since an ideal combination of spectrally distinct and high-efficiency indicator/actuator pairs remains unavailable and would require expensive setups with multiple laser lines and a highly complex optical design, in this chapter, we review unexplored spatial and temporal photophysical features of spectrally comparable indicator and actuator pairs permitting an all-optical single-beam experiment with minimal cross-activation.

Fig. 6 Comparison of different stimulation approaches for a single-beam experiment. (a) outlines the spiral stimulation approach, while (b) elucidates the holographic approach, both combined with fast scanning indicator imaging. The grey inserts outline the methods' respective advantages and drawbacks

Indicators are constantly being improved in terms of higher SNR and better imaging properties. However, mutational screening for a new generation of indicators with improved two-photon cross-sections is rarely performed. Within our introduced theoretical framework, we demonstrate that each actuator opsin molecule is only activated once during a laser beam sweep. In contrast, indicators can emit hundreds of photons during the same time window. Therefore, next to improving the two-photon cross-section of indicators to allow for efficient activation, decreasing their fluorescence lifetimes to below 5 ns can help distinguish between indicator and actuator activities. Spatial restriction of indicator expression to the soma can further help prevent background fluorescence arising from the neuropil, which would otherwise decrease the imaging contrast.

With regard to actuators, we have highlighted the benefits of using moderate or slow off-kinetic opsins for spiraling approaches because they more efficiently integrate single-channel ion conductances towards crossing the spiking threshold. Fast and complete desensitization from peak to stationary photocurrents reduces the probability of multiple action potentials being triggered during a single spiral scan. In this case, high membrane occupancy of actuators is advantageous since only a fraction of the entire membrane is illuminated during a spiral scan.

In contrast, holographic stimulation concurrently activates the entire opsin-packed neuronal membrane with a millisecond illumination. Such an activation eliminates the need to sum over single-channel conductances and favors the use of actuators with fast off-kinetics. To avoid cross-activation via the imaging beam, we suggest choosing blue-shifted opsins relative to the imaging laser beam. Thus, the laser light will only cross-activate actuators with a decreasing red spectral flank, reducing the overall probability of activation. However, holographic stimulation can still efficiently trigger action potentials via such a bathochromic excitation, given high enough light powers.

As the basic photophysical processes of light-gated ion channels are already well understood, we mainly focused on photophysical processes in the context of two-photon excitation. Further, we discussed photo-induced processes beyond the classical all-trans → 13-cis isomerization in channelrhodopsins. Particularly, we describe a second two-photon absorption process based on an anti/syn conformation.

Harnessing any double two-photon absorption to activate opsin molecules in an all-optical experiment would render the light power dependency quadratic for actuators rather than for indicators. Furthermore, just like in the case of indicators, high-throughput mutational analysis for two-photon optimized opsin variants is virtually absent, with this being particularly true for opsin-based actuators. Here, screenings toward high two-photon cross-sections or the facilitation of double two-photon absorption processes could enable artifact-free imaging of indicators and actuators with a single-beam line.

As previous years have demonstrated how prolific and diverse opsin evolution has been through natural selection, future years will reveal whether artificial screening toward two-photon optimization will produce a plethora of actuators for all-optical experiments.

#### Acknowledgments

The study was supported by the German Leibniz Association "Best Minds" program, and the Center for Behavioral Brain Sciences. The authors thank Nicole D'Souza and Eirini Papagiakoumou for their help during the preparation of the manuscript.

#### References


of ReaChR. Front Cell Neurosci 10. https://doi.org/10.3389/fncel.2016.00234

fluorescent genetically encoded calcium ion indicators. Neuroscience

Optogenetics: 10 years after ChR2 in neurons – views from the community. Nat Neurosci 18:1202–1212. https://doi.org/10.1038/nn.4106

photocycle of channelrhodopsin-2 by an interhelical hydrogen bond. Biochemistry 49:267–278. https://doi.org/10.1021/bi901634p

Mol Biol 330:553–570. https://doi.org/10.1016/S0022-2836(03)00576-X


Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

## Miniature Multiphoton Microscopes for Recording Neural Activity in Freely Moving Animals

## Baris N. Ozbay, Gregory L. Futia, Ming Ma, Connor McCullough, Michael D. Young, Diego Restrepo, and Emily A. Gibson

#### Abstract

Miniaturized head-mounted microscopes for in vivo recording of neural activity have gained much recognition within the past decade of neuroscience research. In combination with fluorescent reporters, these miniature microscopes allow researchers to record the neural activity that underlies behavior, cognition, and perception in freely moving animals. Single-photon miniature microscopes are convenient for widefield recording but lack the increased penetration depth and optical sectioning capabilities of multiphoton imaging. Here we discuss the current state of head-mounted multiphoton miniature microscopes and introduce a miniature head-mounted two-photon fiber-coupled microscope (2P-FCM) for neuronal imaging with active axial focusing enabled using a miniature electrowetting lens. The 2P-FCM enables three-dimensional two-photon optical recording of structure and activity at multiple focal planes in a freely moving mouse. Detailed methods are provided in this chapter on the 2P-FCM design, operation, and software for data analysis.

Key words Two-photon microscopy, Fluorescence microscopy, Lens system design, Fiber optics, Ultrafast pulse propagation, Neural imaging, Image processing

#### 1 Introduction

Two-photon laser scanning microscopy (2P-LSM) has opened up opportunities for in vivo biological imaging at the sub-cellular level [1, 2]. In the neuroscience field, 2P-LSM combined with fluorescent indicators can now capture brain activity at high resolution and in real time. Although ingenious methods to study behavior and perception in a head-fixed animal have been developed, there are some behaviors that simply cannot be studied. Examples include navigation and social behaviors, while additionally there is evidence that head fixation can alter behavior and underlying neural processes [3]. Due to these constraints, head-mounted microscopes are now being developed that can allow the animal to be unrestricted with more naturalistic behavior. The first miniature microscopes to be developed for this purpose were one-photon widefield miniscopes [4–7]. These systems rely on innovations in compact CMOS imaging sensors and efficient visible wavelength light emitting diodes for the excitation source. Despite the success of the miniscopes, with thousands deployed in laboratories internationally, there are limitations. One-photon widefield excitation in tissue results in high levels of light scattering, reducing signal to noise as the out-of-focus fluorescence is also collected on the imaging detector. Computational methods are typically used to separate transient fluorescent signals from the background in the image data. Detailed structural information of the brain region being imaged is sometimes not possible to obtain. Recently, modified miniscopes have been developed that place a phase plate [8] or microlens array [9] in the optical detection path, which provides additional information for computational reconstruction in three dimensions (3D). However, light scattering, particularly in densely labelled samples, ultimately limits the imaging depth in these 3D miniscopes.

In contrast to one-photon widefield, two-photon microscopy removes the out-of-focus fluorescence and provides cross-sectional imaging only at the focus. 2P imaging provides excellent signal to noise and detailed structural information, allowing high-resolution images of cells and processes. Previous work demonstrated miniature laser-scanning multiphoton fiber-coupled microscopes for head-mounted neuronal imaging in freely moving rodents using different methods of laser scanning, including a miniaturized microelectromechanical systems (MEMS) actuated mirror [10, 11] or a piezoelectric scanner [12] within the head mount. However, none of these designs leverages the optical sectioning capabilities for full volumetric 3D imaging without introducing cumbersome complexity and weight. Recently, Zong et al. integrated an optical design using a MEMS mirror for lateral scanning and a tunable lens for axial scanning in a head-attached miniature microscope [13, 14]. However, this miniature microscope is complex to align and involves custom lenses and tunable optics. In contrast, the design reported here incorporates a miniature, compact electrowetting tunable lens (EWTL) for increasing the versatility of two-photon capabilities by adding active axial focusing, while using a coherent imaging fiber bundle for lateral scanning. The design uses only commercially available components and can be implemented on any standard bench-top laser scanning microscope.

#### 2 Background

#### 2.1 Optical Design Considerations for Miniature 2P Microscopes

The objectives used for two-photon excitation microscopy in deep tissues typically have a numerical aperture (NA) greater than 0.8 [2]. Imaging with high NA objectives reduces the spot size of the laser, thereby increasing the two-photon fluorescence signal for a given excitation power. However, large NA objectives are a problem for the manufacture of small imaging systems, because the system aperture is constrained by the diameter of the physical optics, which in turn forces the optical design to compromise on other important parameters, such as working distance and field-of-view (FOV) [15]. Assuming a fixed aperture size, larger NA lenses require a reduction in the working distance. Additionally, the complexity of a lens design can be viewed as a function of its optical invariant, i.e., its ability to collect light over a large angle and FOV. Increasing the NA therefore also makes it more challenging to design for a large FOV. When considering the fluorescence signal detection path, the fluorescent photons are only emitted at the focus and detected on a photo-multiplier tube (PMT) such that the background noise of the image is not degraded by scattered photon trajectories. Therefore, the excitation NA and the collection NA, more commonly referred to as Etendue, must be considered separately. The time-averaged intensity of fluorescence collected geometrically is:

$$
\langle I_{f,\text{col}} \rangle \cong \langle I_{f,\text{gen}} \rangle \, \Omega_f \tag{1}
$$

where Ω<sub>f</sub> is the fractional collection efficiency of an imaging system, summarized as:

$$\Omega_f = \frac{1 - \sqrt{1 - \left(\frac{\text{NA}_{\text{col}}}{n_1}\right)^2}}{2} \tag{2}$$

When imaging into scattering tissue, the excitation NA is not as important as the collection NA. This is because high-angle rays have longer optical paths, with higher probability of scattering, and are affected by aberrations. Additionally, there are diminishing returns as the focal volume decreases with higher NA. Therefore, an efficient two-photon imaging system can be designed with a relatively low NA (~0.4–0.5) excitation path if the collection NA can be increased independently. Such systems often use light collection paths with large-area detectors, which can be replicated with large effective-area optical fibers in miniaturized microscopes [10, 16].
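To make the dependence of the collected fraction on the collection NA concrete, the short sketch below simply evaluates Eq. 2. The function name and the immersion index of 1.33 (water) are assumptions chosen for illustration; they are not taken from the chapter.

```python
import math


def collection_fraction(na_col: float, n_medium: float) -> float:
    """Fractional collection efficiency from Eq. 2.

    na_col   : collection numerical aperture
    n_medium : refractive index of the medium in front of the lens (n_1)
    """
    ratio = na_col / n_medium
    if not 0.0 <= ratio <= 1.0:
        raise ValueError("NA must lie between 0 and the medium index")
    return (1.0 - math.sqrt(1.0 - ratio ** 2)) / 2.0


if __name__ == "__main__":
    n_water = 1.33  # assumed immersion medium, for illustration only
    for na in (0.4, 0.5, 0.8, 1.0):
        fraction = collection_fraction(na, n_water)
        print(f"NA_col = {na:.1f}: collected fraction = {100 * fraction:.1f}%")
```

Under these assumptions the collected fraction rises quickly with NA, which illustrates why increasing the collection NA independently of the excitation NA can be worthwhile.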

#### 2.2 Optical Design Considerations for Axial Scanning with a Tunable Lens

Unlike methods of axial scanning that physically move the sample or the objective lens, the use of a tunable lens to adjust the axial focus changes the way light propagates through the optical system. More specifically, the curvature at the back focal plane of the objective lens is modified. When propagated through a lens, this curvature of the wavefront gets transformed into an axial displacement of the paraxial focus spot. This is analogous to how a tilted wavefront becomes transformed by a lens into a transverse displacement of the focus.

One important consideration when designing such a system is the magnitude of curvature that is needed to achieve a desired axial scan range. An EWTL placed in the back focal plane of a lens directly shapes the wavefront entering the lens.

The wavefront phase term ϕ can be written as a function of the pupil radius ρ [17]:

$$\phi(\rho) = -k Z_c \frac{\text{NA}^2}{2 M^2} \rho^2 \tag{3}$$

where M is the objective magnification and Z<sub>c</sub> is the distance at which the focus is formed with a given quadratic phase. Rearranging this in terms of the amount of axial change for a given change in wavefront phase curvature gives:

$$\frac{\phi}{Z_c} = -\frac{\rho^2}{2 M^2} k \, \text{NA}^2 \tag{4}$$

This last expression shows that the ability of a tunable lens to change the axial focus of a lens system is dependent inversely on the square of the objective magnification. This means that designing for high magnification with the purpose of increasing resolution or NA will result in a decrease in axial scan range for a given EWTL.

Another consideration is that any imaging system that produces magnification cannot perfectly represent both the axial and transverse focus transformations simultaneously. This effect is illustrated in Fig. 1. Botcherby et al. describe the consequences of this on imaging properties [18]. In lens design, optical engineers typically optimize for either transverse or axial focusing. Most commonly transverse imaging is preserved because imaging lenses are expected to have uniform performance throughout the field-of-view (FOV). The consequence is an accumulation of spherical aberrations along the axial range of the lens, shown in Fig. 1c. One can also design for the Herschel condition, which instead forms perfect focus spots along the optical axis but accumulates aberrations in the transverse dimension.

The FCM imaging system designs presented here are optimized primarily for the Herschel condition, by constraining the optimization parameters in Zemax optical design software to maintain consistent imaging properties throughout the full focal range of the imaging system. One important characteristic that is maintained is telecentricity, which ensures that there is nearly no magnification change as the focal length is tuned. However, this choice sacrifices some imaging performance in the transverse dimension, and so imaging performance is maintained only up to the expected FOV. Further, it is necessary to balance high magnification, which is required to achieve high resolution, with axial scanning range. In the end, the relationships between aperture size, focal length, transverse and axial consistency, and magnification become a careful interplay of compromises to reach a useful solution for miniaturized imaging. In this work, the actual optical designs are sought empirically based on initial designs and guiding principles. The designs are constrained by the requirements for size, resolution, FOV, and axial scan range, and are additionally restricted to commercially available optics to allow for easier replication and affordability.

Fig. 1 Axial focusing by wavefront curvature shaping at the objective back focal plane. (a) An input converging or diverging wavefront is transformed into an axial translation away from the designed focal length of the lens. A lens producing an ideal diffraction limited focus at the design working distance (b) will not produce a diffraction limited focus when the input wavefront is divergent (c) due to spherical aberrations as a result of the Abbe sine condition

#### 2.3 Cranial Windows

Cranial windows provide optical access to surface-level brain structures with traditional microscope objectives. Typically, two-photon laser scanning microscopy (2P-LSM) is performed on a head-fixed mouse with a coverslip implanted in a small craniotomy above the brain region of interest, normally 3 mm<sup>2</sup> or less in size. There are several variations on windows that show the flexibility of this technique [19–21]. A similar procedure is skull-thinning and polishing, which can be healthier and more stable for the brain tissue, but is time-consuming and may not be as optically clear [22]. Recently, Wang et al. demonstrated that application of index-matching epoxy to the bone can reduce scattering and enhance optical access [23]. Removable windows and permeable windows have also been described [24].

Some examples of common regions accessed by cranial windows are olfactory bulb [25], barrel cortex [26], and visual cortex [27]. These regions display high levels of behavior-dependent neuronal activity relatively close to the surface. The extended depth limit of 2P-LSM, compared with widefield microscopy or confocal laser scanning microscopy, allows access to neurons about 500 μm below the brain surface. In terms of cortical layers in mouse, 500 μm is approximately the depth of the cell bodies found in layers 4/5 of structures such as the motor and somatosensory cortex. Three-photon excitation microscopy has been recently demonstrated to reach structures such as the hippocampus, located around 1 mm below the surface in the mouse brain. However, for deeper structures, it is necessary to excavate the tissue above the target structure or to implant relay optics, such as GRIN lenses [28].

#### 2.4 Gradient-Refractive Index (GRIN) Lenses

GRIN lenses have been classically used for simple and space-efficient collimation of the divergent light exiting a single fiber core. In the past decade, GRIN lenses have been re-purposed for miniaturized imaging applications [29, 30]. The radial profile of the refractive index, n<sub>g</sub>, in a GRIN lens varies according to [31]:

$$n_g(r) = n_{g,0} \left( 1 - \frac{g^2 r^2}{2} \right) \tag{5}$$

where g is the gradient constant, r is the radial distance from the optical axis, and n<sub>g,0</sub> is the index of refraction at r = 0. Rays launched into one end of the GRIN lens will be focused based on the axial length of the lens, defined by the pitch length, p<sub>L</sub>:

$$p_L = \frac{2\pi}{g} \tag{6}$$

Fig. 2 GRIN lens illustrations. Top: Single GRIN lens with pitch ~0.5 with a magnification of 1. Bottom: Two-element GRIN lens with low NA relay at pitch ~0.75 and high NA objective with pitch ~0.25

A pitch of 1 is a full sinusoidal period at a given wavelength. GRIN lenses are also available in a variety of NA values, which are defined by the focus angle in the external medium with index n<sub>1</sub>. This value can be approximated as [31]:

$$\text{NA}_{\text{grin}} \cong \frac{n_{g,0} \, g \, d}{n_1 \csc(gL)} \tag{7}$$

where d is the GRIN lens diameter. These expressions show that GRIN lenses with smaller diameter, d, need a shorter pitch length to achieve the same NA. A long, thin GRIN lens for deep-brain implants at high NA may therefore require a very high pitch number. To get around this constraint, a relay lens of lower NA is used in front of a high NA lens. Figure 2 shows an illustration of two kinds of GRIN lens assemblies used for imaging. A singlet lens is shown compared to a two-element lens with a low NA relay. By increasing the pitch number of the low NA relay, GRIN lenses of various lengths can be constructed for reaching deep-brain structures.
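The relationship between the gradient constant, the pitch length, and the physical length of a GRIN element can be sketched numerically from Eqs. 5 and 6. The gradient constant (0.6 mm⁻¹), on-axis index (1.6), and radius used below are hypothetical values chosen only to illustrate the arithmetic; they do not describe any specific commercial lens.

```python
import math


def grin_index(r_mm: float, n_axis: float, g_per_mm: float) -> float:
    """Radial index profile of Eq. 5: n(r) = n_axis * (1 - g^2 r^2 / 2)."""
    return n_axis * (1.0 - (g_per_mm * r_mm) ** 2 / 2.0)


def pitch_length_mm(g_per_mm: float) -> float:
    """Length of one full pitch from Eq. 6: p_L = 2*pi / g."""
    return 2.0 * math.pi / g_per_mm


def length_for_pitch_mm(pitch: float, g_per_mm: float) -> float:
    """Physical length of a GRIN element with the given pitch number."""
    return pitch * pitch_length_mm(g_per_mm)


if __name__ == "__main__":
    g = 0.6        # hypothetical gradient constant, 1/mm
    n_axis = 1.6   # hypothetical on-axis refractive index
    radius = 0.25  # hypothetical lens radius, mm

    print(f"index at the edge: {grin_index(radius, n_axis, g):.4f}")
    for pitch in (0.25, 0.5, 0.75):
        length = length_for_pitch_mm(pitch, g)
        print(f"pitch {pitch:.2f}: length = {length:.2f} mm")
```

Because the pitch length scales as 1/g, a lens with a stronger gradient reaches a given pitch number in a shorter physical length, which is why a low-NA relay is the usual route to a long implant.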

GRIN lenses are engineered to be highly space-efficient, achieving relatively high NA (up to 0.5) for the small diameters (0.35–1.8 mm). The disadvantages are that they are highly wavelength-dependent and have rapidly deteriorating off-axis performance for imaging. This is mainly due to the small aperture, which vignettes the off-axis rays, effectively reducing the NA toward the edges of the lens. Longer GRIN lenses also suffer from accumulated geometric aberrations. Methods to extend the imaging field and resolution include using active adaptive optics [32] and custom fabricated microlenses [33].

Even with the disadvantages, GRIN lenses offer a tool for providing optical access to otherwise inaccessible deep-brain structures. Deep-brain implants of GRIN lenses have been used with both widefield fluorescence microscopy and 2P-LSM in structures such as the hippocampus [34]. Animals may be head-restrained with an appropriate objective used to image near the top surface of the GRIN lens. Usually a baseplate is implanted on the skull to provide stability to the fragile GRIN lens assembly.

#### 2.5 Fiber-Coupling of Excitation Laser

For 2P-FCM imaging a laser-source and point-scanning system are required. For the laser source, high-intensity laser light must be brought into the miniature microscope enclosure using an optical fiber. Fibers can be selected for the most efficient propagation of the laser light, generally such that the fiber-core supports only the fundamental transverse electromagnetic (TEM00) mode, in which case it is known as a single-mode fiber. In the special case of ultrashort laser pulses required for 2P-LSM, the maintenance of the peak pulse power is dependent on chromatic dispersion, modal dispersion, polarization dispersion, and non-linear effects [35]. Pre-conditioning the pulses can help to cancel these different effects of propagation through the fiber to maintain high peak powers [36–38]. Alternatively, it is possible to use hollow-core photonic bandgap fibers [39], in which the electric field exists mostly in an air-filled core, minimizing chromatic dispersion and nonlinearities. The choice of fiber should also depend on its weight, flexibility, and cost for the practical purposes of mouse imaging. Regardless of the choice, a single fiber core must be focused onto the sample and actively scanned to form an image.

Fiber-coupled scanning microscopy is a relatively well-developed field, owing to the progress in endoscopic surgical and diagnostic clinical tools that use miniaturized imaging heads coupled with fiber-optics to a detection system [40]. There are many techniques for accomplishing this, two of the most common being piezoelectric resonance scanning of the fiber-tip [12, 41–44] and integrated microelectromechanical systems (MEMS) actuated mirror scanning [10, 11, 45].

An extra consideration for these single-fiber delivery systems is that single-mode fiber-cores are inefficient for the collection of fluorescence emission. Solutions include the use of large mode-area fibers flanking the excitation core [16], miniature single-element detectors mounted on the microscope head [46], and a separate emission path for large-area collection fibers [10, 11, 47, 48].

Overall, these single-fiber-delivery techniques allow access to high-efficiency laser-transmission, which is especially useful for 2P-LSM because of the difficulty of maintaining ultrashort pulse integrity at the focus. The cost of these techniques is the added mechanical complexity required to implement miniaturized laser-scanning. This may result in challenges in the optical design that limit imaging performance. Frequently, it is difficult to achieve large scan ranges or maintain large beam apertures using small mirrors. So far, these systems have been discussed only in the context of transverse scanning. Miniature fiber-scanning microscopes that implement a third axial scanning dimension can become overly cumbersome in their complexity.

Fig. 3 Properties of coherent imaging fiber bundles. Left: Illustration of proximal-to-distal image coherence of CIFB, as well as the pixelation of an image formed through the bundle because of the discrete fiber cores. Right: Fujikura CIFB fluorescence image of a uniform target showing core distribution. Image was acquired using a laser scanning microscope focused on the proximal surface of the CIFB and placing a uniform fluorescent sample at the distal end

#### 2.6 Coherent Imaging Fiber Bundles (CIFB)

A CIFB is made from a longitudinally ordered bundle of optical fiber preforms that are drawn into dense canes with only a small amount of cladding between the cores [49]. The resulting CIFB is spatially coherent on both ends, allowing a pattern of light on one end to be represented by the cores on the other end, illustrated in Fig. 3.

CIFBs can be coupled with a variety of simple imaging lenses to allow for full-field imaging. One of the simplest methods is to use optical epoxy to adhere an imaging GRIN lens (pitch ~0.5) to the distal end of the fiber bundle for use as a handheld imaging probe [50]. An image of the Fujikura CIFB is shown in Fig. 3 with one region expanded to better visualize the core distribution. Lateral laser-scanning microscopy using a CIFB can be accomplished by simply scanning the excitation laser focus across the proximal surface of the fiber. The light is sequentially coupled into each core and transmitted to the distal surface. The distal surface can then be imaged onto a sample to create targeted excitation of fluorophores. The emission light can be collected through the same optical path. This setup is convenient because all the filters, detectors, and scanning optics can be located proximal to the fiber bundle, and can potentially be those of a standard bench-top microscope.

Even with the gained simplicity, there are some issues with using a CIFB for miniature laser-scanning microscopy:

- The CIFB introduces pixelation to the image due to the necessary cladding between the fiber-cores. Some techniques can be used to process the images to reduce this artifact, which can be summarized by different forms of low-pass spatial filtering [51–53]. However, the fundamental lateral resolution limit of the imaging system is determined by the core-to-core spacing and magnification of the imaging system.
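As an illustration of the low-pass spatial filtering idea, the sketch below applies a separable Gaussian blur to an image stored as a list of rows. This is a generic filter, not the specific method of any of the cited works; the function names and the choice of a Gaussian width comparable to the core spacing in pixels are assumptions made for this example.

```python
import math


def gaussian_kernel(sigma_px: float) -> list:
    """Normalized 1D Gaussian kernel truncated at about 3 sigma."""
    radius = max(1, int(math.ceil(3 * sigma_px)))
    weights = [math.exp(-0.5 * (i / sigma_px) ** 2)
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def _blur_rows(image: list, kernel: list) -> list:
    """Convolve every row with the kernel, clamping indices at the edges."""
    radius = len(kernel) // 2
    blurred = []
    for row in image:
        n = len(row)
        new_row = []
        for x in range(n):
            acc = 0.0
            for k, weight in enumerate(kernel):
                xi = min(max(x + k - radius, 0), n - 1)
                acc += weight * row[xi]
            new_row.append(acc)
        blurred.append(new_row)
    return blurred


def _transpose(image: list) -> list:
    return [list(column) for column in zip(*image)]


def suppress_core_pattern(image: list, sigma_px: float) -> list:
    """Low-pass filter an image to attenuate the fiber-core pattern.

    sigma_px should be comparable to the core-to-core spacing in pixels
    (an assumption of this sketch, to be tuned on real data).
    """
    kernel = gaussian_kernel(sigma_px)
    rows_done = _blur_rows(image, kernel)
    return _transpose(_blur_rows(_transpose(rows_done), kernel))
```

The filter removes the honeycomb modulation only at the cost of blurring; it cannot recover detail finer than the core-to-core spacing, consistent with the resolution limit stated above.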


While appreciating these challenges, CIFBs greatly simplify the distal optical setup, which is desirable when implementing the distal axial focusing mechanism. Additionally, the use of a CIFB may greatly reduce the cost of adoption for fiber-coupled microscopes because it may allow researchers to use an existing bench-top laser scanning microscope for fiber-coupled imaging with few or no modifications.

#### 2.7 Ultrafast Laser-Pulse Propagation Through Fiber

The time-averaged two-photon fluorescence intensity generated by a pulsed laser source with average power P<sub>avg</sub>, pulse width τ<sub>P</sub>, and pulse repetition rate f<sub>P</sub> is given by the following equation:

$$
\langle I_{f,\text{gen}} \rangle \cong \pi \delta_2 \eta \frac{P_{\text{avg}}^2}{\tau_P f_P} \left( \frac{\text{NA}^2}{h c \lambda_{\text{exc}}} \right)^2 \tag{8}
$$

where δ<sub>2</sub> is the two-photon cross-section, η is the quantum efficiency of the fluorophore, h is Planck's constant, and c is the speed of light [56, 57]. This relationship shows that two-photon fluorescence signal is inversely proportional to the laser pulse duration. Chromatic dispersion through an individual core of the CIFB results in temporal broadening of the pulse, significantly reducing the fluorescence signal from two-photon excitation [58]. Figure 4 illustrates the principle of chromatic dispersion. Due to the spectral bandwidth of the pulse, the different frequencies travel with different velocities so that, at the exit of the fiber, the pulse is spread out in time. In normal material dispersion, the lower frequencies (red wavelengths) travel faster than the higher frequencies (blue wavelengths) and the resulting pulse is positively chirped. Dispersive optics can be used to counteract the material dispersion. One method is a Treacy grating-pair compressor, which uses two parallel gratings to spectrally disperse the light and recombine it in a geometry where the blue light travels faster than the red [59]. Addition of the grating-pair compressor before the fiber can effectively cancel out material dispersion, resulting in a short pulse at the fiber exit.

Fig. 4 Illustration of material dispersion. The initial pulse out of the laser is short with all spectral frequencies in phase. Inside the 1 m glass fiber, the different frequencies travel with different velocities, resulting in a pulse that is stretched out in time. The resulting pulse is positively chirped, where the lower frequencies (red) travel faster than the higher frequencies (blue)
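The effect of uncompensated dispersion on the two-photon signal can be estimated with a short calculation. The sketch below uses the standard broadening formula for a transform-limited Gaussian pulse subject to pure group delay dispersion (GDD), and the inverse dependence on pulse duration from Eq. 8. It ignores higher-order dispersion and all non-linear effects. The example values (80-fs pulses and 66,000 fs² of GDD) are borrowed from the system description in Sect. 3.1 purely as illustrative inputs.

```python
import math


def broadened_duration_fs(tau_in_fs: float, gdd_fs2: float) -> float:
    """FWHM duration of an initially transform-limited Gaussian pulse
    after accumulating a group delay dispersion gdd_fs2 (in fs^2)."""
    x = 4.0 * math.log(2.0) * gdd_fs2 / tau_in_fs ** 2
    return tau_in_fs * math.sqrt(1.0 + x * x)


def relative_two_photon_signal(tau_in_fs: float, tau_out_fs: float) -> float:
    """Signal relative to the unbroadened pulse, using the 1/tau_P scaling
    of Eq. 8 at fixed average power, repetition rate, and NA."""
    return tau_in_fs / tau_out_fs


if __name__ == "__main__":
    tau_in = 80.0     # fs, input pulse duration (illustrative)
    gdd = 66_000.0    # fs^2, uncompensated dispersion (illustrative)
    tau_out = broadened_duration_fs(tau_in, gdd)
    signal = relative_two_photon_signal(tau_in, tau_out)
    print(f"output duration: {tau_out:.0f} fs")
    print(f"relative two-photon signal: {100 * signal:.1f}%")
```

With these inputs the pulse stretches to roughly 2.3 ps and the expected signal drops to a few percent of its original value, which is why pre-compensation is treated as essential here.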

In addition, the high peak powers required for efficient two-photon imaging result in non-linear effects in the fiber. In the nonlinear regime, there is an intensity-dependent change in refractive index of the material, called the optical Kerr effect:

$$n(I) = n_0 + n_2 \cdot I \tag{9}$$

where n<sub>0</sub> is the linear refractive index and n<sub>2</sub> is the second-order nonlinear refractive index of the material.

As a pulse propagates in a material, the Kerr effect results in a time varying refractive index, and therefore a modulation in the phase of the electric field, called self-phase modulation (SPM), resulting in a frequency shift that varies across the pulse, according to the following equation:

$$
\Delta\omega(t) = -\frac{2\pi L}{\lambda_0} \frac{dn}{dt} \tag{10}
$$

where Δω is the shift in the instantaneous frequency across the pulse in time, L is the length of the fiber, and n is the refractive index. Typically, SPM results in spectral broadening when propagating a high-peak power short pulse in fiber; however, in the special case where the pulse starts with a negative chirp, the frequency shift from SPM can cause spectral narrowing [60], resulting in a longer pulse duration, due to the time-bandwidth product. High-peak power pulse propagation through optical fibers for multiphoton imaging has been explored extensively in previous works [61–64]. One solution to obtain short pulses at the fiber exit is to first spectrally broaden the pulse by propagating in a polarization maintaining fiber (PMF) before applying negative chirp from the Treacy grating-pair compressor. As the pulse propagates through the CIFB, the spectrum narrows but is balanced out to obtain the initial spectral width from the laser, resulting in the bandwidth to support a short pulse. In certain cases, it is possible to obtain a pulse that is shorter than the original pulse duration by generating more bandwidth.
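The sign of the SPM frequency shift across the pulse follows directly from Eqs. 9 and 10. The sketch below evaluates Eq. 10 for a Gaussian intensity envelope, treating the envelope as unchanged along the fiber (that is, ignoring dispersion, which is a strong simplification). The nonlinear index n<sub>2</sub> of about 2.6 × 10⁻²⁰ m²/W is a typical literature value for fused silica and the peak intensity is hypothetical; neither is taken from the chapter.

```python
import math


def spm_frequency_shift(t_fs: float, tau_fwhm_fs: float,
                        peak_intensity_w_m2: float, n2_m2_per_w: float,
                        length_m: float, wavelength_m: float) -> float:
    """Instantaneous angular-frequency shift (rad/s) from Eq. 10,
    with n(t) = n0 + n2 * I(t) and a Gaussian envelope I(t)."""
    a = 4.0 * math.log(2.0) / tau_fwhm_fs ** 2           # 1/fs^2
    intensity = peak_intensity_w_m2 * math.exp(-a * t_fs ** 2)
    di_dt_per_fs = -2.0 * a * t_fs * intensity            # W/m^2 per fs
    dn_dt = n2_m2_per_w * di_dt_per_fs * 1e15             # 1/s
    return -(2.0 * math.pi * length_m / wavelength_m) * dn_dt


if __name__ == "__main__":
    params = dict(
        tau_fwhm_fs=80.0,
        peak_intensity_w_m2=1e13,   # hypothetical peak intensity in the core
        n2_m2_per_w=2.6e-20,        # assumed typical value for fused silica
        length_m=1.0,
        wavelength_m=910e-9,
    )
    for t in (-40.0, 0.0, 40.0):
        shift = spm_frequency_shift(t, **params)
        print(f"t = {t:+.0f} fs: shift = {shift:+.3e} rad/s")
```

The leading edge (t < 0) is shifted to lower frequencies and the trailing edge to higher frequencies. For a negatively chirped input, where the leading edge already carries the higher frequencies, this shift pulls the spectrum inward, consistent with the spectral narrowing described above.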

In addition, the high-peak power pulses incident on the proximal glass surface of the CIFB also produce unwanted background from autofluorescence and inelastic Raman scattering [65]. We find that the background signal from the fiber surface is greatly reduced when the pulse is pre-chirped with lower peak intensity, thus resulting in fewer non-linear interactions at the proximal fiber surface.

Furthermore, the variability in diameter, shape, NA, and amount of cladding between adjacent cores warrants an investigation into the feasibility of propagating ultrashort pulses in a CIFB for two-photon imaging. Figure 5 shows a comparison of an image of a uniform fluorescent target with one-photon versus two-photon excitation through a CIFB. As can be noted, there is more heterogeneity in signal for the two-photon case. The issue of core-to-core variability in transmission of short laser pulses through commercial CIFBs was further explored by Garofalakis et al. by measuring the transmitted power, polarization, mode, and pulse duration from different fiber cores. The authors found that variation in pulse durations was the predominant cause of this heterogeneity and concluded that it resulted from a variation in the amount of chromatic dispersion in different cores [66]. Further work to understand and correct for these effects could improve multiphoton imaging through CIFBs.

Fig. 5 Imaging heterogeneity in CIFB. Images taken using the 2P-FCM showing the raw image of the proximal end of the CIFB focused on a NIR phosphorescent detector card to demonstrate one-photon excitation and a uniform green fluorescent test slide to demonstrate two-photon excitation

#### 2.8 Tunable Focus by Liquid Lens Technology

Liquid lenses are a common variety of electrically tunable lens (ETL) that use one or more liquids as a shape-changing refractive interface to allow for focus adjustment with minimal mechanical motion [67]. There are several common liquid ETL types that may be used for performing high-speed optical focusing for laser-scanning microscopy [68]. Large-aperture liquid ETLs are particularly suited for bench-top systems because of their large focal range, high speed, and repeatability. One commercially available example, a 10-mm aperture, 30-mm diameter ETL (EL-10-30-C-MV, Optotune AG, Switzerland), functions by rapidly changing the volume of liquid in a container with a flexible polymer surface, which results in a focal length shift. Large aperture liquid ETLs have enabled rapid axial focusing, up to 100 Hz, in compact laser scanning confocal microscopes (LSCM), 2P-LSM, and selective plane illumination microscopy (SPIM) systems without mechanical movement of the objective or sample [69–72]. However, shape-changing polymer ETLs are not suitable for miniaturized head-attached microscopes because of their large size and susceptibility to orientation and vibration.

Electrowetting tunable lenses (EWTLs) are another type of liquid ETL that has applications in microscopy [73]. An example is the Arctic 316, made by Varioptic, France, with outer diameter 7.8-mm and 2.5-mm aperture. Electrowetting is a method for changing the wettability of a liquid on a dielectric surface by applying a voltage across the interface, effectively changing the contact angle of the liquid with the surface. To form a lens, two fluids with dissimilar refractive-indices and hydrophobicity are placed in a container with dielectric sidewalls. The hydrophilic polar liquid has dissolved impurities to allow it to react to an external electric field. The contact angle of the liquid interface to the sidewalls of the lens can be changed by applying an electric field across the lens. Eventually an equilibrium is reached between the electrostatic forces acting on the polar liquid and the surface tension of the system to create a lens surface with a stable curvature.

Carefully matching the density of the liquids at these scales makes EWTLs very resistant to the effects of gravity due to orientation and vibrations, which makes them ideally suited for a miniature head-mounted system. Commercial EWTLs have also demonstrated very large tuning ranges, on the order of 50 diopters. Further, the responses of EWTLs to input voltages are well described by simple oscillators. Although EWTL focusing speed is not as rapid as some other ETLs, by engineering the voltage input function, the lens response time can be brought to less than 20 ms, which is within the range of what is required for an active scanning system [74]. EWTLs also have the potential to perform extended optical functions, such as beam steering and wavefront shaping [75, 76]. Because of the small aperture size, these lenses are not frequently used in bench-top microscopy applications. However, EWTLs are good candidates for a robust axial focusing solution for miniature microscopes.

#### 3 Methods

This section describes in detail the design, construction, and testing of the two-photon fiber-coupled microscope (2P-FCM), with axial scanning enabled by an integrated EWTL and lateral scanning achieved with the use of a CIFB. The use of an EWTL is ideal for this application as it is lightweight, is compact, has low-power requirements, and is immune to motion and orientation [68, 77]. The optics are packaged in a light-weight 3D-printed enclosure. Pulse propagation through the CIFB is controlled by careful pre-compensation of the dispersion of glass and verified by spectrally resolved auto-correlation measurements. Testing and verification of the 2P-FCM performance is done using imaging resolution test targets and fluorescent beads in thick agarose preparations. Finally, we describe how to image in vivo neuronal activity by 3D-imaging of neurons in the motor cortex of a freely behaving mouse and in the piriform cortex using a GRIN relay lens combined with the head-attached 2P-FCM. A baseplate is permanently affixed to the mouse skull for attachment and alignment for repeated imaging over the same brain region.

#### 3.1 Overall System Design

The experimental setup for imaging with the 2P-FCM is shown in Fig. 6. The distal optics of the 2P-FCM are housed in a two-part 3D-printed enclosure, which can be repeatedly attached to a baseplate affixed to the animal's head. A CIFB couples the distal imaging optics to a custom 2P-LSM system to relay the excitation laser and collect emitted fluorescence from the sample. Laser-scanning over the CIFB forms an image of the sample while the fluorescence collected back through the fiber cores is detected by photodetectors housed in the 2P-LSM after spectral filtering. The individual components of the system are discussed in detail below.

Fig. 6 2P-FCM imaging system. Pulses from a Ti:Sapphire laser source are spectrally broadened through polarization maintaining fiber (PMF) and chirped using a grating-pair pulse stretcher. Output pulses are focused and scanned onto the surface of the CIFB through scanning mirrors, scan/tube lens relay, and 10× objective. Fluorescence emission is collected by the CIFB and directed to a photon counting PMT through a dichroic filter. The collected signal is amplified and transformed to logic levels to be detected by the DAQ counter

#### 3.1.1 Laser-Source

The excitation source is a Spectra-Physics MaiTai HP Ti:Sapphire pulsed laser, with ~80-fs duration pulses tuned to a center wavelength of 910 nm and operating at 80-MHz repetition rate. The beam power is controlled by a half-wave plate on a rotation mount (Newport Conex-AG-PR100P) followed by a Glan-Taylor polarizer (Thorlabs GT10-B).

#### 3.1.2 Spectral and Temporal Pulse Precompensation

In order to obtain the shortest pulses at the sample, the laser is first sent through a pre-compensation setup optimized for propagating through a 1.0 m length CIFB (FIGH-15-600N, Fujikura). The output from the laser is focused into a 0.5 m length polarization maintaining (PM) fiber (PM780-HP, Thorlabs) that causes spectral broadening of the pulse. The output of the PMF is collimated with a fixed fiber-collimation lens (F220APC-780, Thorlabs). The 1.0 mm beam is then expanded through a 3.75× beam expander to reduce the irradiance on the gratings. The beam is reflected off the two gratings at near Littrow angle, separated by 265 mm. It is double passed through the grating pair using a retroreflector, with the beam exiting at a different height. The gratings are reflective ruled gold with a density of 300 grooves/mm (49-572, Edmund Optics). The grating-pair stretcher applies ~66,000 fs<sup>2</sup> of negative GDD to the laser pulse to compensate for the positive dispersion from the 0.5 m length PMF, 1.0 m length CIFB, and additional non-linear dispersion at the power levels used in the experiments. The precise grating separation distance was determined empirically by maximizing the fluorescence signal while imaging a fluorescent test slide (Chroma Technology) through the 2P-FCM. After traveling through the grating pair, the beam size is reduced by 2.5× with a reverse beam expander to optimize the beam diameter for coupling into the CIFB.
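As a rough consistency check on the grating-pair parameters, the GDD of a double-pass grating pair can be estimated from the standard textbook expression for a Treacy compressor in first diffraction order. The sketch below makes several assumptions that are not stated in the text: the 265 mm separation is taken as the perpendicular distance between the gratings, the incidence angle is set to exactly the Littrow angle, and the function names are illustrative only.

```python
import math

SPEED_OF_LIGHT = 299_792_458.0  # m/s


def littrow_angle_rad(wavelength_m: float, groove_period_m: float) -> float:
    """First-order Littrow angle: sin(theta) = lambda / (2 d)."""
    return math.asin(wavelength_m / (2.0 * groove_period_m))


def grating_pair_gdd_fs2(wavelength_m: float, grooves_per_mm: float,
                         separation_m: float, incidence_rad: float,
                         passes: int = 2) -> float:
    """GDD (fs^2) of a parallel grating pair in first order.

    separation_m is assumed to be the perpendicular grating separation;
    passes=2 corresponds to the double-pass geometry with a retroreflector.
    """
    d = 1e-3 / grooves_per_mm                       # groove period, m
    sin_diffracted = wavelength_m / d - math.sin(incidence_rad)
    cos_sq = 1.0 - sin_diffracted ** 2
    if cos_sq <= 0.0:
        raise ValueError("no propagating first diffraction order")
    single_pass = -(wavelength_m ** 3 * separation_m) / (
        2.0 * math.pi * SPEED_OF_LIGHT ** 2 * d ** 2) * cos_sq ** -1.5
    return passes * single_pass * 1e30              # s^2 -> fs^2


if __name__ == "__main__":
    wavelength = 910e-9
    period = 1e-3 / 300.0
    theta = littrow_angle_rad(wavelength, period)
    gdd = grating_pair_gdd_fs2(wavelength, 300.0, 0.265, theta)
    print(f"Littrow angle: {math.degrees(theta):.2f} deg")
    print(f"estimated GDD: {gdd:.0f} fs^2")
```

Under these assumptions the estimate is on the order of the ~66,000 fs<sup>2</sup> of negative GDD quoted above. In practice the separation was tuned empirically, so this calculation is best read as a starting point for alignment rather than a prediction.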

#### 3.1.3 2P-LSM Bench-Top System

The beam is scanned through a galvanometric mirror scanning system (6215H, Cambridge Technologies), relayed through a 50 mm FL scan lens and 180 mm FL tube lens in an Olympus IX71 microscope. The beam is focused and laterally scanned on the surface of the CIFB using a 10×/0.4 NA Olympus UPLANSApo objective lens. An XYZ-translation stage (CXYZ05, Thorlabs) is used to accurately align the fiber to the focus of the objective lens. The beam propagates through the CIFB and then through the distal miniature optics and is focused onto the sample. The fluorescence emission is collected by the same optics, back through the CIFB. Because the miniature optics are not chromatically corrected, green emission light is projected onto a ~50 μm diameter area on the CIFB surface. Power transmission through the CIFB was measured using a 532 nm light source and was found to have ~67% efficiency when collected through multiple fiber cores. The 10× objective collects the fluorescence emission from the CIFB, which is separated from the excitation laser path by a primary dichroic mirror (T670LPXR, Chroma Technology). A second dichroic mirror (FF562-Di02, Semrock) splits the red and green fluorescence, which is focused using an achromatic doublet lens (LB1761-A, Thorlabs) onto separate large-area photon counting photo-multiplier tubes (PMT) (H7422P-40, Hamamatsu). The output pulses from the PMTs are amplified by a high-bandwidth amplifier (ACA-4-35db, Becker & Hickl GmbH) and converted to logic-level pulses by a timing discriminator (6915, Phillips Scientific). The pulses are counted by a data-acquisition card (PCIe-6259, National Instruments) at a rate of 20 MHz. The counts are sampled and binned by pixels and converted into an image in custom software in LabVIEW (National Instruments) that also controls the EWTL driver.
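The binning of photon counts into pixels can be sketched as follows. The authors' acquisition software is written in LabVIEW and its details are not given, so this is only a hypothetical illustration in Python: it assumes the counter delivers a running (cumulative) count at each sample clock tick, and that a fixed number of samples corresponds to one pixel. The function name and arguments are invented for this example.

```python
def counts_to_image(cumulative_counts, samples_per_pixel, width, height,
                    start_count=0):
    """Convert a stream of cumulative counter readings into an image.

    cumulative_counts : running photon count at each sample clock tick
                        (assumed acquisition format, hypothetical)
    samples_per_pixel : number of counter samples per pixel dwell time
    width, height     : image size in pixels
    start_count       : counter value just before the first sample
    """
    expected = samples_per_pixel * width * height
    if len(cumulative_counts) != expected:
        raise ValueError(f"expected {expected} samples, "
                         f"got {len(cumulative_counts)}")
    image = []
    previous = start_count
    index = 0
    for _row in range(height):
        row = []
        for _col in range(width):
            # Photons in this pixel = counter at the end of the pixel
            # minus counter at the end of the previous pixel.
            last = cumulative_counts[index + samples_per_pixel - 1]
            row.append(last - previous)
            previous = last
            index += samples_per_pixel
        image.append(row)
    return image


if __name__ == "__main__":
    # Tiny synthetic stream: 2 x 2 pixels, 2 samples per pixel.
    stream = [0, 1, 1, 3, 3, 3, 4, 6]
    print(counts_to_image(stream, samples_per_pixel=2, width=2, height=2))
```

At a 20 MHz sample rate, a hypothetical pixel dwell time of 5 μs would correspond to 100 samples per pixel. A real implementation would also need to handle counter rollover and the flyback portion of each scan line.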

3.1.4 2P-FCM Miniature Optical System Design

The imaging system for the 2P-FCM was designed in Zemax optical design software. Models for the stock lenses and for the EWTL were obtained from the manufacturers (Edmund Optics, Thorlabs, and Varioptic). The CIFB (Fujikura Ltd. FIGH-15-600 N) has an outer diameter of 700 μm, an active image diameter of 550 μm, and a length of 1.0 m. There are ~15,000 cores with a core-to-core spacing of 4.5 μm and a core diameter of 3.2 μm, as previously measured by Chen et al. with scanning electron microscopy [55].

The miniature optics contained in the head-mounted 2P-FCM imaging system are shown in Fig. 7. The fiber-coupling lens is an asphere (FL: 6.2 mm, diameter: 4.7 mm, Edmund Optics 83-710) that collimates the light diverging from the fiber bundle. The electrowetting lens is placed in the collimated beam, and the light is refocused onto the sample by an objective consisting of a plano-convex lens (FL: 7.5 mm, diameter: 3.0 mm, Edmund Optics 49-177) and an aspheric lens (FL: 2.0 mm, diameter: 3.0 mm, Thorlabs 355151-B). The nominal magnification of this imaging system is 0.4 and the field-of-view (FOV) is ~220 μm, corresponding to the de-magnified CIFB active imaging diameter. Similarly, the lateral sampling resolution is the de-magnified core spacing, which is ~1.8 μm.
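The de-magnification arithmetic in the paragraph above can be written out as a short worked example. The sketch below only restates numbers given in the text (magnification 0.4, active image diameter 550 μm, core spacing 4.5 μm); the helper name `to_object_space` is hypothetical.

```python
MAGNIFICATION = 0.4          # nominal CIFB-to-sample magnification
CIFB_IMAGE_DIAMETER_UM = 550.0
CIFB_CORE_SPACING_UM = 4.5


def to_object_space(size_on_fiber_um, magnification=MAGNIFICATION):
    """Scale a dimension on the CIFB face to its size at the sample."""
    return size_on_fiber_um * magnification


fov_um = to_object_space(CIFB_IMAGE_DIAMETER_UM)
sampling_um = to_object_space(CIFB_CORE_SPACING_UM)
print(f"FOV at sample: ~{fov_um:.0f} um")
print(f"Lateral sampling at sample: ~{sampling_um:.1f} um")
```

The two results correspond to the ~220 μm FOV and ~1.8 μm sampling interval quoted above.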

A commercially available EWTL (Varioptic Arctic 316) is used to control the axial focusing of the 2P-FCM imaging system. The predicted working distance and other imaging properties from Zemax at three different voltage settings are summarized in Table 1. The optical power range of the EWTL is specified as −16 to +36 diopters. The optical system is optimized through the full focal range of the EWTL to minimize magnification change, maximize the axial scan range, and maximize the working distance of the 2P-FCM.

Fig. 7 Optics of the 2P-FCM miniature microscope head that focuses excitation light from CIFB cores onto the tissue. The CIFB-coupling asphere collects the light from the cores of the CIFB, which are then passed through the aperture of the EWTL. The plano-convex lens and the objective asphere focus the light onto the tissue through a #1 coverglass with 0.15-mm thickness

#### Table 1 2P-FCM optical parameters from Zemax through a range of focal lengths


The 2P-FCM has a low excitation NA of 0.45; however, the large effective area of the CIFB results in a higher collection NA of ~0.6. This is illustrated in Fig. 8, which shows the Zemax model for 910-nm forward excitation and 532-nm backward emission.

3.1.5 3D-Printed Miniature Head-Mount Design

The enclosure for the 2P-FCM optics is designed in Solidworks 3D CAD software (Dassault Systèmes). The packaging is split into three sections (top, bottom, and baseplate), shown in Fig. 9. The top section contains the CIFB ferrule, held in place by two set-screws, and the fiber-collimating asphere. The bottom section contains the objective lenses. The unmounted lenses are held by friction in precisely sized openings. The top section has two curved tabs that interface with slots in the bottom section, which help to ensure reproducible alignment. The EWTL and the electrode are sandwiched between the bottom section and the top section with an O-ring that ensures good electrical contact. The flat-flex electrode cable exits the enclosure through a small slot between the sections. The top-section tabs have a single thread at the end, which interfaces with the baseplate as shown in Fig. 9a. In this way, the baseplate is pulled up against the bottom section by the top section. This greatly improves rigidity when attached to a moving animal. The baseplate is designed with ridges and holes to improve the adhesion of the cement for attachment to the animal skull. The entire enclosure is 3D-printed using a high-resolution projection-based resin printer (Kudo3D Titan 1) with a resolution of 50 μm. The material used is a photo-curable resin (3DM-XGreen) dyed with 0.5% molybdenum disulfide to decrease light scattering and thus increase feature resolution. A photo of the top and bottom sections immediately after printing is shown in Fig. 9b. 3D printing allows optimization of the prototype and easily enables design changes, such as the inclusion of GRIN lenses for deep-brain imaging.

Fig. 8 Numerical aperture comparison of forward excitation light at 910 nm (top) and backward emission light at 532 nm (bottom). Note that the blue lines indicate the optical rays entering the system on axis, while the green and red are off-axis rays. Largest emission and excitation field positions are matched at 220-μm FOV. Note that the backward emission is defocused on the fiber bundle end so that multiple fibers are used to collect the fluorescent emission

Fig. 9 3D-printed enclosure. (a) A two-part 2P-FCM snaps together to secure the EWTL and electrode. (b) Photo of 3D-printed parts, top: before any processing with supports still attached and bottom: assembled 2P-FCM

3.2 Test Sample Preparation

Resolution and axial scan range measurements were performed by imaging fluorescent beads embedded in agarose (Sigma-Aldrich A9414) and a USAF 1951 resolution target (Edmund Optics 38-257). 2-μm yellow-green fluorescent beads (Invitrogen F8853) were used to measure axial scanning extent as well as lateral and axial resolution.

Low-melting-point agarose was prepared at a concentration of 0.5% in water. The 2-μm diameter fluorescent yellow-green beads were diluted in the agarose to a concentration of ~2.0 × 10<sup>7</sup> beads/mL. Approximately 2.0 mL of solution was placed on a #1 coverglass and allowed to set at room temperature. The beads were imaged in sequential axial planes by a 20× 0.75 NA Olympus UPLANSApo objective with a motorized stage and separately by the 2P-FCM by changing the voltage applied to the EWTL.

3.3 Mouse Imaging Setup

All experiments were approved and conducted in accordance with the Institutional Animal Care and Use Committee of the University of Colorado Anschutz Medical Campus. Three-month-old male C57BL/6 mice (neocortex recordings, Jackson Laboratories stock No 000664) or Ntsr1-Cre mice (piriform recordings, MMRRC stock No 030648-UCD) were anesthetized by intraperitoneal ketamine-xylazine injection. The skin above the target site was numbed by lidocaine injection and retracted to expose the skull. For cranial window recordings in neocortex, the mouse was injected with an adeno-associated virus driving the expression of GCaMP6s under the synapsin promoter (AAV5.Syn.GCaMP6s), similar to procedures in [78]. The coordinates of the injection targeted the hindlimb somatosensory cortex, 0.2 mm posterior to bregma and 1.5 mm lateral to the midline, at a depth of 300 μm [79]. The injection volume was 0.66 microliters, delivered with a glass micro-pipette through a 0.5-mm hole drilled at the target site. For GRIN lens recordings in piriform cortex, the mouse was injected with 1 microliter of AAV5-hsyn-DIO-GCaMP6s into anterior piriform cortex (AP: 0.1 mm, ML: 3 mm, DV: 4.1 mm).

One month after AAV injection, the mice used for neocortex recordings were implanted with an optical cranial window near the injection site, using standard techniques as previously described. Briefly, mice were anesthetized by isoflurane inhalation and the scalp was numbed by subcutaneous lidocaine injection. The skin above the skull was removed to expose the injection site and skull surface. A 2-mm square window of skull was removed immediately anterior to the injection hole to expose the dura mater. The opening was covered with a 2-mm square #1 coverglass and secured in place with cyanoacrylate glue. Dental acrylic cement (C&B-Metabond) was used to cover the skull surface. The presence of fluorescence signal was confirmed with standard 2P-LSM using a 20× 1.0 NA Zeiss Plan-Apochromat water-immersion objective. For piriform recordings, a 1-mm diameter, 9-mm length GRIN lens (Inscopix, 1050-004596) was implanted in the region targeted for AAV injection, and a custom-made headplate was attached with dental acrylic cement (C&B-Metabond) 2 weeks after AAV injection.

The baseplate attachment procedure is similar to what has been described for other miniature head-attached microscopes. While the mouse was still anesthetized, the 2P-FCM was held and positioned above the window with a micromanipulator (Sutter MP-285) until fluorescence signal could be observed with widefield epifluorescence through the bundle. The target region was chosen with two-photon imaging and the 2P-FCM was positioned to the region with the baseplate attached. The baseplate was then secured to the existing acrylic with black acrylic cement (Lang Dental Jet Acrylic). After allowing to set for ~30 min, the 2P-FCM was removed, leaving the baseplate in place, and the mouse was allowed to recover.

The imaging setup is illustrated in Fig. 10. The mouse was lightly anesthetized with isoflurane inhalation. The baseplate was carefully gripped by thumb and forefinger and the 2P-FCM was inserted and secured with a quarter-turn. The EWTL electrode was connected to light-gauge wires that were draped, along with the CIFB, over a horizontal metal post above the behavior cage. The mouse was allowed to recover in the behavior cage for imaging. The cage was illuminated by red light to minimize coupling into the fluorescence detection path, and a camera (Logitech C615) was positioned above the cage to monitor behavior during imaging.

Fig. 10 Mouse attachment. (a) The CIFB is attached to a coupling objective on the proximal end and the 2P-FCM on the distal end. (b) 2P-FCM is attached to the permanent baseplate on the mouse with a quarter-turn

3.4 Image Processing

The images from the 2P-FCM show a honeycomb pixelation pattern due to the packing of the cores of the CIFB. Several methods have been described to de-pixelate images from CIFBs [53, 80, 81]. The simplest methods involve low-pass filtering with either a blurring function [82] or masking the image in the frequency domain [83]. However, two-photon imaging through a CIFB has the additional complication of the non-uniformity of the fiber cores. Each core is assumed to have a unique sensitivity, due to the variability in diameter, shape, NA, and amount of cladding between adjacent cores. This manifests as discrete variations in image intensity across the FOV.

This was addressed by programmatically dividing out the sensitivity of each core and interpolating the core values to remove the pixelation pattern. A flat map of the full-field CIFB fiber cores was taken by imaging a fluorescent test slide (Chroma Technology) with the 2P-FCM, with an example shown in Fig. 11a. The flat map stores the centroid coordinates of the cores and their corresponding sensitivity. The processing was performed with custom software (Matlab, Mathworks). Each image to be analyzed was registered to the flat map, which allowed the identification of the cores. An example of a raw pixelated image is shown in Fig. 11b. The relative sensitivity of each core was compensated by dividing by the flat map values. The honeycomb pattern was eliminated by using the nearest neighbor interpolation method [84]. A Savitzky-Golay filter was used to reduce the single-pixel noise introduced by the core multiplication factor during flat normalization.

Fig. 11 Image processing of two-photon imaging through CIFB. (a) Flat-field map showing non-uniformity due to differences in sensitivity in individual cores in the CIFB. (b) Unprocessed image of cells in mouse cortex with fiber-pixelation. (c) Post-processed image after fiber-cores were corrected with flat-field mask and re-gridded into a typical square pattern. Grid lines added to emphasize pixels

During the interpolation, the fiber cores were registered to a square pixel grid for straightforward analysis. An example output frame is shown in Fig. 11c.
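The flat-field correction and re-gridding described above can be illustrated with a minimal sketch. This is not the authors' Matlab code: it assumes the core centroids and flat-map sensitivities are already known and registered, uses a brute-force nearest-neighbor lookup, and omits the Savitzky-Golay smoothing step. The function name `depixelate` and the toy three-core data are hypothetical.

```python
def depixelate(core_xy, core_values, flat_values, grid_w, grid_h, pitch):
    """Flat-field correct per-core intensities, then resample them onto a
    square pixel grid by nearest-neighbor lookup.

    core_xy      list of (x, y) core centroids (same units as pitch)
    core_values  raw intensity read at each core
    flat_values  relative sensitivity of each core from the flat map
    """
    # Divide out the sensitivity of each core
    corrected = [v / f for v, f in zip(core_values, flat_values)]
    image = []
    for row in range(grid_h):
        y = (row + 0.5) * pitch  # pixel-center coordinates
        line = []
        for col in range(grid_w):
            x = (col + 0.5) * pitch
            # Index of the core whose centroid is closest to this pixel
            nearest = min(
                range(len(core_xy)),
                key=lambda i: (core_xy[i][0] - x) ** 2
                + (core_xy[i][1] - y) ** 2,
            )
            line.append(corrected[nearest])
        image.append(line)
    return image


# Toy example: three cores with different sensitivities viewing a uniform sample
cores = [(1.0, 1.0), (3.0, 1.0), (2.0, 3.0)]
flat = [1.0, 0.5, 2.0]
raw = [10.0, 5.0, 20.0]
image = depixelate(cores, raw, flat, grid_w=4, grid_h=4, pitch=1.0)
for line in image:
    print(line)
```

In the toy example, the three cores report different raw values for a uniform sample, and the corrected, re-gridded image is uniform. For a real bundle with ~15,000 cores, a spatial index such as a k-d tree would be a more practical choice than the brute-force search used here.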

For the processing of temporal scans, each frame was processed with the same flat-map alignment so that the cores are static in the field. Once the honeycomb pattern and CIFB-induced intensity variation were removed, a clustering algorithm was used to identify regions of interest (ROIs) of high correlation [85]. Significant changes in cytosolic Ca<sup>2+</sup> were identified as changes in fluorescence larger than three standard deviations above baseline within each ROI.
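The thresholding criterion above can be sketched as follows. The text does not specify how the baseline is defined, so this example assumes, purely for illustration, that the first few frames of each ROI trace are free of activity and serve as the baseline. The function names and the sample trace are hypothetical.

```python
import statistics


def delta_f_over_f(trace, baseline_frames):
    """Convert a raw ROI fluorescence trace to dF/F using the mean of the
    first baseline_frames samples as F0 (an assumed baseline definition)."""
    f0 = statistics.mean(trace[:baseline_frames])
    return [(f - f0) / f0 for f in trace]


def detect_transients(dff, baseline_frames, n_sd=3.0):
    """Return frame indices where dF/F exceeds baseline mean + n_sd * SD."""
    baseline = dff[:baseline_frames]
    threshold = statistics.mean(baseline) + n_sd * statistics.stdev(baseline)
    return [i for i, value in enumerate(dff) if value > threshold]


# Hypothetical ROI trace: six quiet frames followed by a transient
roi_trace = [100, 102, 98, 101, 99, 100, 150, 140, 101, 99]
dff = delta_f_over_f(roi_trace, baseline_frames=6)
active_frames = detect_transients(dff, baseline_frames=6, n_sd=3.0)
print(active_frames)
```

The `n_sd` parameter is exposed so that the same sketch can be run with a stricter threshold.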

3.5 Testing and Calibration of Resolution, Magnification, and Axial Scan Range

The lateral and axial resolution and the axial focusing range of the 2P-FCM can be tested by imaging 2-μm diameter yellow-green fluorescent micro-beads embedded in clear agarose. The lateral resolution is fundamentally limited by the average spacing of the fiber cores in the CIFB. With the core-to-core spacing of ~4.5 μm and 2P-FCM magnification factor of 0.4, the theoretical lateral sampling at the object is ~1.8 μm.

The beads are imaged with the 2P-FCM at sequential focal planes by tuning the EWTL focus in discrete steps to obtain a Z-stack. The images are processed to remove the fiber-pixelation pattern as described in the previous section. Figure 12 shows a comparison of processed images of the beads imaged with a 20× 0.75 NA objective and the 2P-FCM. The average lateral and axial line profiles of 5 beads measured from different focus positions are fit to a Gaussian function. The axial bead size measured by the 20× objective is 4.5-μm FWHM and with the 2P-FCM it is 9.9-μm FWHM. The lateral bead size measured by the 20× objective is 1.7-μm FWHM (dotted grey line) and with the 2P-FCM it is 2.6-μm. The lateral bead size is larger than the diffraction limit as it is limited by the fiber bundle spacing. With a bead size of ~2 μm, non-uniform sampling of the bead with multiple fiber cores causes a larger effective lateral profile. The axial profile measurements of both the 20× objective at 0.75 NA and the 2P-FCM at 0.45 NA are similar to what is expected from the diffraction-limited calculations.

Fig. 12 Axial and lateral resolution tested by imaging fluorescent beads with a 20× Olympus objective (dashed lines) or with the 2P-FCM (solid lines). Reprinted from [92]

Fig. 13 Measurement of axial scan range. (a) Side-projection of ~2-μm diameter fluorescent beads suspended in clear agarose and imaged with a 20× 0.8 NA Olympus objective (green) using 910 nm excitation light and a motorized stage or with the 2P-FCM while varying the EWTL power (red). (b) Predicted scan range as the EWTL optical power is changed modeled in Zemax (grey line) and Z-positions of measured beads (black circles)

In order to calibrate the electrowetting lens voltage, the sample is imaged by a 20× 0.75 NA objective and by the 2P-FCM. Figure 13a shows side projections of beads overlaid in green and red, respectively. The same region of the agarose-bead sample was imaged in both cases, such that most of the same beads appear in both Z-stacks. This made it possible to compare their apparent size and axial location directly to determine the 2P-FCM scan range. The predicted axial focus plane through the EWTL focusing range from Zemax and the actual bead positions are shown in Fig. 13b (gray line and black dots, respectively). The full focusing range did not span the range predicted (240-μm predicted vs. 180-μm measured), likely due to under-performance by the EWTL at the high end of the optical power range.

The optical magnification from the CIFB to the target was evaluated by imaging the group 6, element 2 square on the USAF 1951 resolution target with the 2P-FCM as shown in Fig. 14. The imaging diameter of the CIFB is 550 μm. The magnification at three different optical power settings for the EWTL is measured by comparing the scaled size of the resolution target through the CIFB with the actual size. The results are summarized in Table 2. The magnification is measured to be ~0.4, varying by less than 5% through the focusing range, agreeing closely with the predictions from the Zemax model.
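The magnification measurement above can be sketched numerically. The spatial frequency of a USAF 1951 element follows the standard relation 2^(group + (element − 1)/6) line pairs per mm, from which the true line width is known. The apparent width on the CIFB face used below (17.4 μm) is a made-up illustrative value, not a measurement from the chapter, and the function names are hypothetical.

```python
def usaf_lp_per_mm(group, element):
    """Spatial frequency (line pairs per mm) of a USAF 1951 target element."""
    return 2 ** (group + (element - 1) / 6)


def usaf_line_width_um(group, element):
    """Width of a single bar: half of one line-pair period, in micrometers."""
    return 1000.0 / (2.0 * usaf_lp_per_mm(group, element))


def magnification(true_size_um, size_on_cifb_um):
    """CIFB-to-sample magnification: size at the sample divided by the size
    the same feature occupies on the CIFB face."""
    return true_size_um / size_on_cifb_um


true_width_um = usaf_line_width_um(group=6, element=2)
apparent_width_um = 17.4  # hypothetical width measured on the CIFB face
m = magnification(true_width_um, apparent_width_um)
print(f"Group 6, element 2: {usaf_lp_per_mm(6, 2):.2f} lp/mm, "
      f"bar width {true_width_um:.2f} um")
print(f"Magnification for the illustrative measurement: {m:.2f}")
```

Repeating this calculation at each EWTL setting would give the variation through focus that Table 2 summarizes.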

Fig. 14 Measurements of magnification of 2P-FCM. Elements of a USAF 1951 resolution target were imaged through the CIFB at three different focus settings. The known size of the elements is indicated


#### Table 2 Measured magnification variation through focus

#### 4 Discussion


Fig. 15 3D imaging with 2P-FCM of fixed tissue with oligodendrocytes expressing eGFP. (a) 3D volume acquired by the 2P-FCM (220 μm diameter lateral, 180 μm axial) with over 200 cells in the image. (b) Processed image of a single slice in the stack after filtering to remove pixelation pattern. (Reprinted from [92])

spectra. A common example is GFP and tdTomato, which are both efficiently excited at ~900 nm with pulsed light, but have spectrally separate emission bands. Another feature of a fiber-coupled system is that the emission signal can be spectrally de-mixed after returning through the fiber-bundle. The same filters and detectors used in a benchtop multiphoton microscope for multi-channel imaging can be readily employed for 2P-FCM imaging.

Figure 17 shows two-color imaging with 900 nm two-photon excitation of tdTomato and GFP simultaneously with both a 20x Olympus objective and the 2P-FCM. The cells are GFP-expressing mature oligodendrocytes and tdTomato-expressing oligodendrocyte lineage cells and sparsely labeled astrocytes (Mobp-EGFP; Olig2-Cre; R26-lsl-tdTomato triple transgenic mice). The two experiments were performed using the same detectors and filters. This provides an example of how the 2P-FCM can act as a swappable objective lens for mouse imaging, without needing to change the filters or detection path to acquire images on different channels.

Fig. 16 Tilted-field imaging enabled by rapid focusing of the EWTL. (a) Maximum intensity projection of fixed mouse brain tissue expressing GCaMP6s in neurons, acquired by the 2P-FCM. Arrows indicate cell bodies retained in fields. (b) Side projection of the volume. The same cell bodies are indicated by the arrows. The planes for the horizontal and angled scan limits are indicated. (c) Tilted field images taken from selected planes at angles of −30°, 0°, and 30°, as indicated in (b). (Reprinted from [92])

4.2 2P-FCM Imaging In Vivo

For in vivo 2P-FCM imaging, a 3D-printed baseplate is permanently attached to the mouse. During each imaging session, the 2P-FCM is attached to the baseplate with little force on the skull. A photo of a baseplate attached to a mouse with a cranial window is shown in Fig. 18a. Figure 18b shows a photo of a mouse with the 2P-FCM attached. The CIFB and electrode wire for the EWTL are draped passively over a horizontal post to reduce the weight on the mouse. When attached, the mouse is able to move around freely in a small area (about 12″ square). The movement is somewhat restricted by the length of the CIFB (1.0 m) and the torsional resistance of the CIFB, which prevents rotation of more than about 180°. It was found that, after a short acclimation (~30 min), mice could traverse the entire behavioral area and habituated to the restrictions. Future implementations may include a commutator with a rotation encoder for realignment, which has been shown previously [86].

Figure 19 shows a widefield epifluorescence image of the region for an implanted mouse through the 2P-FCM. The left image shows the background fluorescence and vasculature on the day of implantation. The right image was taken 17 days later, showing that the baseplate is stable and only a slight lateral shift in alignment is observed.

Fig. 17 Multi-color imaging. (a) Maximum intensity projection of a region of brain tissue acquired with a 20x 0.75 NA objective. Yellow cells are oligodendrocytes (Green and Red), while red-only cells are astrocytes and oligodendrocytes. Two astrocytes are marked with arrows and are easily identified by the characteristic bushy morphology. (b) Same tissue imaged with the 2P-FCM, using the same detectors, filters, and excitation wavelength. Green and red cells are visible in the field, with likely astrocytes marked by arrows. (Modified from [92])

Fig. 18 Mouse 2P-FCM attachment photos. (a) Baseplate implanted on mouse with cranial window. (b) Mouse behaving with 2P-FCM attached

The expression level in the neurons virally transfected with GCaMP6s in the cortex was verified by histology shown in Fig. 20. Neuronal cell bodies are seen in layers 4/5, as well as layers 2/3 in lower density.

Fig. 19 Implant stability over 17 days imaged with widefield epi-fluorescence. Scale bar 50 μm

Fig. 20 Histological coronal section of mouse injected with GCaMP6s virus, 12 weeks after injection, showing good expression in layers 2–5 of motor cortex

4.2.1 2P-FCM In Vivo Mouse Imaging Through Cranial Window

In vivo two-photon imaging of neuronal activity through the 2P-FCM was performed in an awake and mobile mouse expressing GCaMP6s in cortical neurons. The mouse was allowed to wander freely in a 7″ by 11″ plastic cage. The cage was filled with sawdust as well as food and various novel objects, such as raisins and tissue paper, to motivate the mouse to explore the environment while attached to the 2P-FCM. A 3D projection of a Z-stack acquired by tuning the EWTL focus, showing the imaging volume, is presented in Fig. 21b, c. Bright cell bodies and processes were visible down to 160 μm, corresponding to cortical layers 2/3. Time-courses were acquired sequentially at three focal planes at z = 50, 95, and 140 μm below the cranial window. At the deepest focal plane (z = 140 μm), the active regions are round objects ~10–20 μm in diameter, which are likely cell bodies. At the middle depth of 95 μm, there is a mixture of processes and cell bodies, while closer to the brain surface, activity appears predominantly from processes, as shown in Fig. 21d. Figure 21f shows the time courses of the ΔF/F signal for the five indicated regions of interest at the three different depths.

The mouse was recorded with a camera during the imaging session to correlate the imaging results to the motion of the mouse. Lateral motion artifacts were present during some of the recordings. A motion correction algorithm was used to correct for the laterally shifting field [87]. The results of the motion correction indicate a time-averaged motion artifact magnitude of <2 μm, but had peaks up to 10 μm in some cases. The intensity of bright cells that did not exhibit fluorescence activity did not vary with the motion artifacts, so we conclude that the extent of the motion in the axial-dimension is lower than the depth-of-focus for the 2P-FCM (<10 μm). Overall, these motion artifacts were similar to those reported in head-fixed 2PE imaging studies [88, 89] and widefield imaging, which benefits from a much larger depth of field [4]. The 2P-FCM did not become loose or dislodged during the imaging sessions, which lasted between 1 and 4 h.
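To illustrate how a lateral shift between frames can be estimated, the sketch below performs an exhaustive search over small integer pixel shifts and picks the one that minimizes the mean squared difference with a reference frame. This is a simple stand-in for illustration only and is not the motion correction algorithm of [87]. The function name `estimate_shift` and the toy frames are hypothetical.

```python
def estimate_shift(reference, frame, max_shift=3):
    """Return (dy, dx) such that frame[y + dy][x + dx] best matches
    reference[y][x], searching integer shifts up to max_shift pixels."""
    h, w = len(reference), len(reference[0])
    best, best_score = (0, 0), float("-inf")
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            total, n = 0.0, 0
            # Compare only the region where both frames overlap
            for y in range(max(0, -dy), min(h, h - dy)):
                for x in range(max(0, -dx), min(w, w - dx)):
                    diff = reference[y][x] - frame[y + dy][x + dx]
                    total += diff * diff
                    n += 1
            score = -total / n  # negative mean squared difference
            if score > best_score:
                best_score, best = score, (dy, dx)
    return best


def blank(h, w):
    return [[0.0] * w for _ in range(h)]


# Toy frames: two bright pixels that move by (dy, dx) = (1, 2)
ref = blank(8, 8)
ref[3][3], ref[2][4] = 1.0, 0.5
moved = blank(8, 8)
moved[4][5], moved[3][6] = 1.0, 0.5
shift = estimate_shift(ref, moved)
print(shift)
```

Multiplying the estimated pixel shift by the object-space pixel size would give the displacement in micrometers, which is the quantity reported above.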

4.2.2 2P-FCM In Vivo Mouse Imaging in Deep Brain Regions Through GRIN Lens

The 2P-FCM can additionally be used for imaging in deep brain regions by coupling to a GRIN lens. The baseplate is attached such that the 2P-FCM is aligned at the center of the GRIN lens, with the focus of the 2P-FCM positioned at the image working distance of the GRIN lens. The 1:1 GRIN relay lens used did not change the magnification of the 2P-FCM image, although aberrations from imaging through the GRIN lens reduced the field-of-view (FOV) in comparison to the cranial window imaging. Additionally, actuating the EWTL changed the axial focal plane imaged by the 2P-FCM through the GRIN lens, as shown in Fig. 22. As an example of behaviorally relevant imaging of neuronal activity in deep brain regions, the 2P-FCM was used to image activity in anterior piriform cortex in a freely moving female Ntsr1-cre mouse exploring a novel environment with male bedding. GCaMP6s was expressed virally using AAV\_CWSL.hSyn.DIO.Synaptophysin-GCaMP6s.P2A.mRuby. Imaging data were processed to remove the fiber bundle artifact using the custom Matlab software and then analyzed by manually selecting ROIs where ΔF/F transients exceeded a threshold of 6 standard deviations. Behavioral video was analyzed with the DeepLabCut [90] software package, which provides markerless tracking of body parts. The mouse was allowed to explore the environment for 5 min while behavioral video was recorded and neural activity was imaged with the 2P-FCM. Piriform cortex is the largest recipient of projections from the olfactory bulb, and it is expected to show an increase in neural activity when the mouse smells novel odors. DeepLabCut was used to track the snout of the mouse during the image sequence. The measured distance of the mouse to the bedding was compared with activity for manually selected ROIs. Distinct increases in activity can be observed when the mouse is in close proximity to the novel male bedding.

Fig. 21 Two-photon Ca<sup>2+</sup> imaging in an awake and freely moving mouse using the 2P-FCM. (a) Widefield camera image of the epifluorescence taken through the FCM showing vasculature in the FOV. A black rectangle indicates the acquisition region for the following images. (b) 3D projection of a Z-stack acquired by sequentially imaging and focusing through the tissue using the EWTL. (c) Side projection of the same Z-stack indicating the three depths at which time series were acquired: at 50 μm, 95 μm, and 140 μm depth, recorded at frame rates of 2.5, 1.3, and 2.0 Hz, respectively. (d) Maximum intensity projections of image frames that coincide with Ca<sup>2+</sup> transients, showing structural changes through the focal depths. Scale bar is 50 μm. (e) Selected ROIs that contain fluorescence transients that exceed the 6 SD threshold. Time traces for five representative ROIs are selected for each depth. (f) Detailed time-courses of the ΔF/F signal for the five indicated regions. (g) All identified transients, aligned by the peak ΔF/F signal, shown in gray with averaged intensity signal in black. (Modified from [92])

Fig. 22 Behaviorally dependent responses from 2P-FCM recording in piriform cortex in a mouse implanted with a GRIN lens and GCaMP6s-expressing neurons. (a) Still frame of video of mouse behavior environment containing familiar bedding and novel unfamiliar bedding during freely moving behavior with attached 2P-FCM. (b) Example maximum projections at two separate focal planes, shown with relative distance of focus from GRIN lens. (c) Distance between the female mouse's snout and the foreign male mouse's bedding versus time. Manual correction was made to marker placement when DeepLabCut misplaced or lost markers. (d) An example frame of the behavioral video labeled by DeepLabCut. The green marker tracks the mouse's body, the blue marker tracks the mouse's snout, and the red marker is placed on the male bedding. (e) Summary of activity, showing greater activity in some ROIs as the mouse approaches the novel odor. (f) Max projection showing six manually selected regions used in the analysis
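The snout-to-bedding distance used in this analysis can be computed from tracked coordinates with a few lines of code. The sketch below assumes the per-frame snout positions and the bedding marker position have already been exported as pixel coordinates; it does not call any DeepLabCut functions. The function names, the proximity threshold, and the sample coordinates are hypothetical.

```python
import math


def snout_to_bedding_distance(snout_xy_per_frame, bedding_xy):
    """Euclidean distance (in pixels) from the snout to the bedding marker
    for every video frame."""
    bx, by = bedding_xy
    return [math.hypot(x - bx, y - by) for x, y in snout_xy_per_frame]


def frames_near_bedding(distances, threshold_px):
    """Indices of frames in which the snout is within threshold_px."""
    return [i for i, d in enumerate(distances) if d <= threshold_px]


# Hypothetical tracked coordinates (pixels) for three video frames
snout_track = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
bedding_marker = (6.0, 8.0)
distances = snout_to_bedding_distance(snout_track, bedding_marker)
near = frames_near_bedding(distances, threshold_px=5.0)
print(distances, near)
```

Comparing these frame indices with the per-ROI ΔF/F traces, after accounting for the different frame rates of the video and the imaging, is one way to relate activity to proximity as in Fig. 22e.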

#### 5 Materials

Pulse Pre-compensator


Objective Adapter to Hold CIFB


2P-FCM Fiber and Head Mount


Imaging Test Targets


#### 6 Notes

The following includes detailed notes on how to align the pre-compensation optics before the microscope and how to align the 2P-FCM for freely moving mouse imaging. Details including the Zemax optical design, Solidworks files, and software can be found at https://github.com/CUNeurophotonics/2PFCM.

6.1 Optimizing Alignment into Single Mode Fiber

One of the challenges is setting up the single-mode fiber and grating pair compressor in front of the microscope. In particular, aligning the beam through the single-mode fiber can be challenging. Make sure to terminate the fiber with an APC (angled physical contact) connector on both fiber ends. APC termination minimizes back reflection because the fiber face is cut at an angle. A flat surface, by contrast, can reflect light back along the same optical path and cause the ultrafast laser to stop modelocking. For input coupling into the single-mode fiber, we use a fiber collimator (Thorlabs ZC618APC-B) that allows for the adjustment of beam diameter and divergence. Before the input coupler, set up two steering mirrors in x, y kinematic mounts to optimize the beam input position and angle. Ideally, the last mirror before the input coupler should be as close to the input coupler as possible so that it mostly controls the input angle while the first mirror controls the position. To perform the initial alignment of the laser to the fiber input coupler, set the laser to 780 nm with 10–20 mW of power. Higher power will damage the fiber if misaligned, and 780 nm is used so that the laser beam is visible.

Fig. 23 Method for alignment of single mode fiber. Visualization of red back-propagating beam from visual fault locator with forward propagating beam from femtosecond laser on IR card. Femtosecond laser is incident on back of IR card and seen as a green spot on the front side

One method to get alignment started is to use a visual fault locator connected to the fiber end face. One can then spatially overlap the incoming femtosecond laser beam with the back propagating light from the locator using the input alignment mirrors, shown in Fig. 23.

After performing the coarse alignment, remove the fault locator and replace it with a power meter. Proceed to "walk in" the laser with the mirrors: turn the horizontal knob of the first mirror, use the other mirror's horizontal knob to maximize the power, and repeat. If this keeps lowering the power from the initial amount, try turning the first knob in the opposite direction and repeat. After the horizontal alignment is optimized, do the same for the vertical. If this does not work, you are likely in a side lobe of the Airy disk. In that case, walk the beam out of the local peak by moving steadily in each direction. The power should go down and then up in one of the directions. Then start the walk-in again!

At the end of the alignment, the output power from the fiber should be ~70–80% of the power measured before the input collimator.

Fig. 24 Photo of grating pair compressor. Labels indicate (a) fiber output coupler on kinematic mount, (b) D-mirror, (c) retro-reflector roof mirror on kinematic mount and linear stage, (d) output mirror, (e) kinematic mount with iris for alignment, (f) first grating, and (g) second grating

6.2 Setup and Alignment of the Grating Pair Compressor

Alignment steps for the grating pair compressor (shown in Fig. 24) are as follows:


6.3 Optimizing Imaging Through Coherent Fiber Bundle

The FCM objective holder, used to align the proximal end of the CIFB to the microscope focus, is shown in Fig. 25.

Fig. 25 Diagram of FCM objective holder to mount the FCM to any two-photon laser scanning microscope. The holder screws into the objective turret on the microscope with the appropriate adapter. A linear stage (OC1-TZ) adjusts the focus of objective to the end face of the coherent imaging fiber bundle (CIFB). A cage mount with xy translation (OC1-LH-XY) adjusts the lateral position of the fiber to center it on the objective field-of-view


6.4 Installing the Pedestal for the 2P-FCM on the Cranium

Important: During the cranial window surgery, place the head plate as close to the cranium as possible. A thick Metabond layer between the cranium and the metal plate will make it impossible to place the 2P-FCM in the position necessary to be able to focus at depth. If necessary, you can thin the bottom of the FCM baseplate.

Optimize 2P-FCM imaging as detailed above. Then perform the following steps to install the FCM over the cranial window in an anesthetized mouse.

1. Hold the 2P-FCM attached to the baseplate perpendicular to the cranial window. Bring the 2P-FCM close to the cranial window and zero the Sutter micromanipulator. Focus on the GCaMP-labeled neurons.

2. Perform a z-stack with the electrowetting lens and do a time course recording. When you have found the right location, mix adhesive and place small amounts around the baseplate to bond it with the cranium. Take care not to bond the baseplate to the 2P-FCM.

#### 7 Conclusions

We describe a head-mounted 2P-FCM that achieves 3D imaging of neural activity in a freely moving mouse. The imaging volume is 220 μm in diameter by 190 μm in depth. The device is optimized for resolving neuronal somata, with a lateral resolution of 2.6 μm and an axial resolution of 9 μm. The 2P-FCM is compact and lightweight and includes an electrowetting lens for active axial scanning. We demonstrate the use of this device for tilted plane imaging, multicolor imaging, and fast multiplane imaging. The 2P-FCM was demonstrated for the imaging of neuronal activity in different planes of cortical layer 2/3 of a freely moving mouse with minimal motion artifacts. The 2P-FCM differs from previous head-mounted microscopes [4, 12, 46, 47, 86, 91] because it allows live focusing over a depth range of ~200 μm, suitable for imaging neurons in layer 2/3 of cortex.

The goal of this work has been to create miniature fiber-coupled microscopes (FCMs) that can compete with head-fixed imaging setups for imaging the brain of awake, behaving mice. Two key technologies have driven the progress of this work. The first is the coherent imaging fiber bundle (CIFB), a tool that is widely adopted in clinical endoscopy. CIFBs allow passive imaging that functions remarkably well for both single- and multi-photon fluorescence fiber-coupled imaging. The second is the electrowetting tunable lens (EWTL), which, because of its light weight and mechanical simplicity, is an ideal tool for miniaturized microscope applications.

The design outlined here can be easily assembled and attached to a standard benchtop multiphoton microscope, which is commonly available in many labs, making the technology inexpensive to implement. Future work can extend this technology to multi-site brain recording or potentially combine it with two-photon holography for imaging and photostimulation in the freely moving animal with 3D access.

#### Acknowledgments

We would like to thank Ethan Hughes for assistance with in vivo animal imaging work, Juliet Gopinath, Victor Bright, and Robert Cormack for helpful discussions. We acknowledge Nicole Arevalo for assistance with animal care, and Elizabeth Gould and Wendy Macklin for providing brain tissue from Plp-EGFP mice. Funding for this work was provided by the National Science Foundation DBI-1353757, CBET-1631912, and IIP-1602128 and the National Institutes of Health BRAIN Initiative NS099577.

#### References


imaged in freely moving animals. Proc Natl Acad Sci 106(46):19557–19562


optimized dispersion control by reflection grisms at 800 nm. Opt Express 20(23): 25624–25635


Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

## Chapter 8

## Optogenetics and Light-Sheet Microscopy

## Laura Maddalena, Paolo Pozzi, Nicolò G. Ceffa, Bas van der Hoeven, and Elizabeth C. Carroll

#### Abstract

Light-sheet microscopy is a powerful method for imaging small translucent samples in vivo, owing to its unique combination of fast imaging speeds, large field of view, and low phototoxicity. This chapter briefly reviews state-of-the-art technology for variations of light-sheet microscopy. We review recent examples of optogenetics in combination with light-sheet microscopy and discuss some current bottlenecks and horizons of light-sheet microscopy in all-optical physiology. We describe how 3-dimensional optogenetics can be added to a home-built light-sheet microscope, including technical notes about choices in microscope configuration to consider depending on the time and length scales of interest.

Key words Computer-generated holography, Optogenetics, Adaptive optics, Light-sheet microscopy, Zebrafish

#### 1 Imaging Translucent Organisms

Larval fish, flies, and worms are popular model organisms in developmental biology [1] and, increasingly, in systems neuroscience [2–4]. Optical translucency makes these organisms well suited for visualizing physiological functions using high-resolution fluorescence imaging with sub-cellular spatial resolution. Their small size and their ability to thrive when immersed in water make it possible even to image embryonic development and behaviors in toto over hours or days [5].

Light-sheet microscopy, also known as Selective Plane Illumination Microscopy (SPIM), has emerged as the method of choice for imaging smaller organisms, offering a number of advantages over point-scanning microscopy in speed, accessible volume, and phototoxicity. The light-sheet revival over the last two decades is tightly associated with important milestones in live tissue imaging, including the iconic example of whole-brain imaging in larval zebrafish (Danio rerio). Following early examples of calcium imaging of fictive activity [6], several research groups worldwide now routinely record calcium activity from the nearly 10<sup>5</sup> neurons of the young awake, behaving zebrafish. The resulting avalanche of data is beginning to lead to new insights about the communication between different brain areas (see [3, 7] for recent reviews).

Eirini Papagiakoumou (ed.), All-Optical Methods to Study Neuronal Function, Neuromethods, vol. 191, https://doi.org/10.1007/978-1-0716-2764-8\_8, © The Author(s) 2023

A natural extension to such imaging studies is the integration of optical methods for perturbation, such as optogenetics [8, 9], optopharmacology [10], and cell ablation [11, 12]. The expanding toolkit of molecular probes offers many optogenetic actuators to remotely activate or inactivate cellular processes. In the context of controlling neural activity, this progress in engineering molecular probes, together with the development of suitable optical methods [13], makes it possible to photostimulate action potentials in cells expressing photosensitive channels and to read out the affected neural activity using fluorescent reporters for calcium [14] or voltage [15]. Many of these tools have been developed into transgenic animal strains [16, 17]. Early demonstrations of optogenetic manipulations in combination with light-sheet microscopy include optogenetic control over a variety of physiological phenomena, especially in larval zebrafish, from the beating of the heart [18] to cellular control of reflexive behaviors [19].

In this chapter, we describe how optogenetics can be added to a home-built microscope inspired by the open-source project OpenSPIM [20]. As an example, we describe in detail a microscope configuration suitable for cellular or sub-cellular optogenetics in larval zebrafish. The method involves adding two-photon photostimulation shaped by computer-generated holography (2P-CGH). The stimulation module exploits the high numerical aperture (NA) detection objective of the light-sheet microscope to simultaneously excite multifocal points targeted either to sub-cellular regions or to multiple somata. The light-sheet module provides flexibility to read out neural activity in tiny organisms, from small volumes to the whole brain. Volumetric imaging is achieved with an electrically tunable lens, allowing independent control of imaging depth without moving the detection objective and, consequently, the axial location of the stimulation foci.

We provide technical notes on optical alignment, alternative configurations for different applications, and limitations and challenges of combining optogenetics with light-sheet microscopy. Finally, we offer some perspectives on extending all-optical physiology to higher spatial resolution in vivo.

1.1 Light-Sheet Technologies

The functioning principle of light-sheet microscopy is to illuminate the sample with a thin sheet of light while collecting the fluorescent signal at an angle (usually orthogonal) relative to the illuminated plane. The illuminated plane is aligned with the focal plane of the detection objective, enabling an image to be collected by a camera, as in a widefield microscope. Whereas laser-scanning confocal microscopy achieves optical sectioning through rejection of photons generated outside of the excitation focus, light-sheet microscopy avoids generating out-of-focus fluorescence. This approach provides optical sectioning while minimizing photobleaching and phototoxicity.

1.1.1 Light-Sheet Configurations

Light-sheet microscopes can be implemented in a variety of configurations, distinguished by the position and number of microscope objectives, the sheet-forming illumination optics, and the detection optics. Readers are referred to several excellent review articles focused on developmental biology and high-resolution applications [21, 22]. With respect to brain imaging, variations of light-sheet microscopy have been driven by two main challenges:


Here, we briefly compare light-sheet configurations used for whole-brain imaging in zebrafish, as shown in Fig. 1.

#### Selective Plane Illumination

The basic SPIM design (Fig. 1a) uses two orthogonal microscope objectives for illumination (I) and detection (D) of fluorescence. The illumination light is typically shaped into a two-dimensional sheet with a cylindrical lens [23, 24]. To image a volume, the sample is either translated with respect to the detection objective, or the light sheet is scanned with a galvanometric mirror while also keeping the illuminated plane conjugated to the camera with either a piezo objective or an electrically tunable lens.

Later it was demonstrated that rapid scanning of a pencil beam (a long, thin illumination profile in one dimension) could generate a "virtual" sheet, as seen by the camera, with the major advantage of reducing light exposure to the sample [25]. This approach is alternately called digitally scanned light-sheet microscopy (DSLM). In this case, volume acquisition requires a second scanning mirror.

For (relatively) small, translucent samples, where light can enter the sample from any side, the SPIM design is convenient. Because two (or more) objective lenses are used, this design decouples the axial ($\Delta z$) and lateral ($\Delta x$, $\Delta y$) resolutions, which scale as the inverse of the numerical aperture of the excitation objective, $\Delta z \propto \mathrm{NA}_I^{-1}$, and of the detection objective, $\Delta x, \Delta y \propto \mathrm{NA}_D^{-1}$, respectively.

The constraints on axial resolution limit the usable FOV. The usable length of a Gaussian light sheet increases with its thickness. For this reason, longer Gaussian profiles have poor optical sectioning and increased phototoxicity, because they illuminate a sample slice thicker than the detection depth of field. The most immediate improvement for obtaining longer 1-D uniform excitation profiles comes as a trade-off with temporal resolution: tiling the excitation light sheet [26] allows, in principle, selection of only the central uniform region of the excitation profile and stitching together of multiple images over an arbitrary FOV. However, in all-optical physiology experiments, the decreased temporal resolution may be unacceptable.

Fig. 1 Selection of light-sheet microscope configurations used for whole-brain imaging in larval zebrafish. (a) Fluorescence is collected along a sheet of light formed by sideways illumination. (b) In variations of multiview light-sheet microscopy, additional objectives illuminate or collect fluorescence simultaneously, allowing for either greater uniformity of illumination or multiplexing image formation from different angles. (c) In variations of single-objective LS, a tilted sheet is swept laterally across the sample while collecting a tilted epifluorescence image. (d) Selective volume illumination microscopy is a hybrid of light-sheet illumination and extended depth of field detection

An alternative way to obtain uniform illumination over a larger FOV is to illuminate the sample from two sides using an additional illumination objective opposite of the first (dual-sided light sheet). Depending on how the sample is mounted in the microscope, it may also be possible to illuminate from additional angles [27].

#### Multiview Illumination

In the case of multiple illumination sheets, each sheet needs to cover only half of the total FOV, so a lower NA sheet can be used to better preserve optical sectioning. For instance, the IsoView [28] light-sheet microscope employs two different DSLM geometries to simultaneously illuminate the sample from two opposite sides, collecting the view with two cameras (Fig. 1b). With this method, by combining the overlapping images from multiple angles, it is possible to achieve isotropic spatial resolution [28]. This parallel excitation also represents a robust solution against sample opacity. Moreover, employing two sets of galvanometric mirrors in each illumination arm (one for scanning, one for correcting the incidence angle on the sample) allows the use of online optimization algorithms [29] to partially correct for low-order sample-induced aberrations. These improvements come at a cost in terms of both hardware (the number of parts and alignment difficulty) and software complexity. Additionally, because every image is collected twice, the amount of data collected necessitates high-end storage capabilities and lengthy analysis pipelines in order to properly fuse the views into a final volume. An added benefit of this geometry is that the sample can be left stationary, while the scanning is performed by the galvanometric mirrors (to move the excitation profile in 3D) coupled with piezo motors that keep the detection objective focused on the illumination plane.

#### Swept Plane (Single Objective)

A single-objective light-sheet configuration, employing epifluorescence, can be obtained in several ways [30, 31] by generating a tilted elongated focus (Fig. 1c). The tilted sheet is swept laterally across the sample to image a volume. Swept plane approaches are gaining ground in neuroimaging because they facilitate high volume speeds, as reviewed recently by Hillman [32]. The single-objective geometry has the advantage of using the same sample preparation as confocal or two-photon microscopy and, uniquely, can also be extended to samples of arbitrary size [33]. While the swept plane approach partially sacrifices resolution because every image is formed collecting planes from regions far from the optimal focus, re-imaging the tilted plane (and some post-processing) can recover diffraction-limited resolution. Higher NA objectives with short working distances can also be used when applying light-sheet microscopy to small samples (e.g., single cells) [34, 35].

#### Hybrid Light-Sheet Microscopes

A clever approach to improve imaging speed is to increase out-of-focus contributions in a principled way. Manipulating the detection point-spread function, for instance by adding spherical aberration [36] or a cubic phase profile [37], extends the effective depth of field of the detection objective so that information can be harvested from a thicker illuminated volume. Other approaches for simultaneous volume acquisition borrow from light-field microscopy. For instance, an exciting direction of hybrid imaging called selective volume illumination (SVIM) [38] merges light-sheet excitation with light-field microscopy techniques to allow extremely fast (tens of Hz) volumetric imaging. The trade-off for resolution is acceptable for somatic imaging [39], and SVIM significantly improves contrast over light-field microscopy with widefield illumination.

1.1.2 Engineering Illumination Improves Resolution and Photodamage

Light-sheet microscopy is intrinsically efficient with photons. The local intensity required for light-sheet imaging is smaller than that for confocal techniques [40], including spinning disk confocal microscopy. In fact, with a scanning light sheet, the total energy deposited at each point of a 3-dimensional (3D) sample is reduced by a factor equal to the total number of sections obtained during the imaging [41]. This minimizes photodamage to the specimen and also has a positive effect on the imaging speed. Moreover, detectors used in light-sheet microscopy, such as CCD or sCMOS cameras, provide a better dynamic range than the single-pixel detectors used in point-scanning approaches (e.g., avalanche photodiodes or photomultiplier tubes). A poor dynamic range causes detector saturation, which translates into trade-offs of smaller volume or slower acquisition. Light-sheet illumination is less prone to excitation saturation than point-scanning techniques, so it is not necessary to compromise imaging speed with long frame exposure times.

To further minimize illumination intensity and photodamage, engineering the excitation beam to produce quasi-non-diffracting beams, in particular Bessel beams [42, 43], has strongly impacted the field. Bessel-like beams preserve a small beam waist over a longer distance compared to beams with a Gaussian amplitude profile (Table 1), translating to more uniform illumination over the FOV of the detection objective.

#### Table 1

Beam properties. Bessel beam characteristics depend on the geometry of an annular mask, placed in a plane conjugated with the back aperture of the excitation objective. Parameters: e = pixel size; NA = objective numerical aperture; M = magnification; n = refractive index; λ = wavelength; w = annulus width; J<sub>1</sub> = Bessel function of the first kind; α, α′ = constants proportional to outer and inner annulus radius, respectively

Moreover, when propagating through inhomogeneous samples, Bessel-like beams have reduced scattering and beam spreading due to their self-healing property; namely, the beam recovers its initial intensity profile after an obstacle. However, Bessel beams have a major downside: a large portion of the energy resides in side lobes, which can spread out for tens of microns beyond the central peak. In fact, the side lobes may generate fluorescence signal from out-of-focus planes, preventing the theoretical gain in optical sectioning while increasing phototoxicity. For this reason, researchers have introduced methods to reduce the contribution of the side lobes to the image:


1.2 Design Choices

1.2.1 Prioritize Scale, Resolution, or Speed

Optimal choice of the light-sheet microscope configuration depends on the research questions of interest, particularly with respect to the spatial and temporal resolutions required. Small organisms tend to have smaller cells than mammalian tissues, so resolution is of particular concern in both imaging and photostimulation. The choice of light-sheet approach is often a matter of choosing the best trade-off between speed and resolution, given the dimensions and transparency of the sample.

In designing the system described below, we considered applications involving the larval zebrafish. For this sample, the SPIM design is convenient. The larval zebrafish brain occupies a volume of approximately 500 × 800 × 300 μm, and neuronal somata are typically 5–10 μm in diameter. The whole brain can be measured in a single FOV of a 10× detection objective, with which cellular resolution is easily achieved. However, we also wanted the flexibility to image at sub-cellular resolution in smaller brain regions, so we have used higher magnification to achieve sampling of better than 0.2 μm per pixel on the camera. This lateral resolution is achieved with a high-NA water-dipping detection objective (practically limited to NA < 1.1 by commercial objectives). To achieve sub-cellular axial resolution over most of the brain, we chose to apply the superior optical sectioning of a Bessel beam.

For example, considering illumination with a laser source with λ = 488 nm and an excitation objective with NA = 0.29, a Bessel profile can be generated to uniformly cover 160 μm with a central peak width of ≃ 0.6 μm. To reach the same FOV, a Gaussian beam would be more than twice as thick (around 10 μm, as calculated from the Rayleigh length formula); this value reflects a trade-off between the sheet length and the divergence that can be tolerated at the edges of the FOV.
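As a rough check of the Gaussian figure quoted above, the following sketch inverts the Rayleigh length formula, $z_R = \pi n w_0^2 / \lambda$, for the beam waist. It assumes, for illustration only, that the 160 μm FOV is equated with one Rayleigh length and that the sheet thickness is taken as the full $1/e^2$ width $2w_0$; the function name and these conventions are not taken from the chapter, and other conventions (e.g., FOV = $2z_R$, or propagation in water) give somewhat smaller values.

```python
import math

def gaussian_waist_um(wavelength_um, rayleigh_length_um, refractive_index=1.0):
    """Return the 1/e^2 waist radius w0 (um) of a Gaussian beam.

    Inverts z_R = pi * n * w0**2 / wavelength (vacuum wavelength).
    """
    return math.sqrt(wavelength_um * rayleigh_length_um / (math.pi * refractive_index))

# Assumption: the 160 um field of view is taken as one Rayleigh length.
w0 = gaussian_waist_um(wavelength_um=0.488, rayleigh_length_um=160.0)
gaussian_thickness_um = 2.0 * w0   # full 1/e^2 width at the waist
bessel_core_um = 0.6               # central peak width quoted in the text

print(f"Gaussian sheet thickness: {gaussian_thickness_um:.1f} um")
print(f"Ratio to Bessel core:     {gaussian_thickness_um / bessel_core_um:.0f}x")
```

Under these assumptions the Gaussian sheet comes out close to the 10 μm quoted in the text, which is why the Bessel profile is attractive when sub-cellular axial resolution is needed across the whole FOV.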

1.2.2 Type of Photostimulation

The experimenter has many options to add photostimulation optics to a light-sheet microscope, including the variety of methods discussed in Chapters 1, 3–5, and 11 of this book. Both scanning and parallel approaches to photostimulation can be applied to small, translucent samples. In the first case, resonant scanners or galvanometric mirrors steer a focused beam across multiple regions of interest (ROIs), whereas, in the latter case, all the ROIs are illuminated simultaneously by using computer-generated holograms (CGHs) projected through spatial light modulators (SLMs).

The same excitation strategies applied in living animals require increased optical sectioning and penetration depth, both provided by two-photon (2P) illumination. For example, Dal Maschio et al. [48] have integrated a 2P-CGH module with a two-photon scanning microscope, generating an instrument capable of identifying behavior-related neural circuits in living zebrafish larvae. The stimulation is targeted to single somata with a diameter of 6 μm and an axial resolution of 9 μm over a volume of 160 × 80 × 32 μm. Comparable lateral and axial resolutions for circuit optogenetics are achieved by McRaven and colleagues [49], in their 2P-CGH setup coupled to a 2P scanning microscope with remote focusing, to discover cellular-level motifs in awake zebrafish embryos. On the other hand, De Medeiros et al. [12] have combined a scanning unit with a multiview light-sheet microscope. This is a flexible instrument to perform ablation of single cells in zebrafish embryos and also localized optogenetic manipulations with concurrent in toto imaging in Drosophila. Here, the effect of the optogenetic manipulation can be monitored at the embryo scale with cellular resolution. Another example of SPIM integrated with 2P scanning stimulation is presented in [27], where whole-brain imaging and brain-wide manipulations in larval zebrafish reveal causal interactions. All these works demonstrated that in the small brain of the larval fish (about 0.1 mm<sup>3</sup>), it is possible to both record and stimulate with millisecond temporal resolution and single-cell precision over the full volume of the engaged neural circuit.

1.2.3 One Photon or Two?

As mentioned in other chapters, multiphoton excitation is the simultaneous absorption of n lower-energy photons to electronically excite a higher-energy single-photon transition. For visible-absorbing optogenetic chromophores, near infrared (NIR) wavelengths (700–1100 nm) are typical for two-photon absorption. NIR has the added advantage of high penetration depth in biological tissues [50]. Transparent tissue might seem to obviate the need for multiphoton microscopy. On the contrary, there are several arguments for the use of two-photon excitation in light-sheet microscopy.

On the imaging side, scanned beam approaches also made two-photon excitation feasible because the spherical focus can generate the highest peak intensity for a given power, extending light-sheet microscopy to imaging in highly scattering samples [51, 52]. Even for optically translucent samples such as larval zebrafish, scattering is noticeably reduced. For brain imaging in larval zebrafish, two-photon microscopy is often preferred because it is more orthogonal to the visual system [2]. Imaging with visible light impacts general brain activity, visual sensitivity, and even innate motor behaviors due to non-visual opsins [53], though by carefully avoiding direct illumination of the eyes, it is possible to deliver visual stimuli and even virtual reality [27].

For photostimulation with cellular, or sub-cellular, precision in 3D tissues, it is crucial to exploit multiphoton absorption for optical sectioning. Since 2P absorption is a non-linear process, its probability depends quadratically on the intensity of the excitation light. The main consequence is an improved optical sectioning because the stimulation is generated only in the vicinity of the geometrical focus, where the light intensity is the highest [54]. The resolution of the multiphoton-excited fluorescence is described by the full width at half maximum (FWHM) of the three-dimensional point-spread function (3D-PSF) of the fluorescence intensity, $h_i$. The following equation describes the dependence of the 3D-PSF $h_i$ on the 3D-PSF of the illumination intensity, $I$, for two-photon excitation:

$$h^{2P}(u,v) \propto |I(u,v)|^2,\tag{1}$$

where $u$ and $v$ are, respectively, the axial and lateral coordinates of the optical system.
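To illustrate the effect of the squaring in Eq. (1), the sketch below compares the FWHM of an illumination profile with that of its square. It assumes a purely hypothetical one-dimensional Gaussian profile (a real PSF is not Gaussian), and the helper `fwhm` is introduced here only for illustration. For a Gaussian, squaring narrows the FWHM by a factor of $\sqrt{2}$ at a fixed wavelength.

```python
import math

def fwhm(xs, ys):
    """Full width at half maximum of a sampled single-peaked profile (linear interpolation)."""
    half = max(ys) / 2.0
    crossings = []
    for i in range(len(xs) - 1):
        y0, y1 = ys[i], ys[i + 1]
        if (y0 - half) * (y1 - half) < 0:  # profile crosses the half-maximum level
            t = (half - y0) / (y1 - y0)
            crossings.append(xs[i] + t * (xs[i + 1] - xs[i]))
    return crossings[-1] - crossings[0]

# Hypothetical Gaussian illumination profile I(v) with unit 1/e^2 radius.
xs = [i * 0.001 - 3.0 for i in range(6001)]
intensity = [math.exp(-2.0 * x * x) for x in xs]
two_photon = [value ** 2 for value in intensity]  # Eq. (1): h_2P proportional to I^2

ratio = fwhm(xs, intensity) / fwhm(xs, two_photon)
print(f"FWHM(I) / FWHM(I^2) = {ratio:.3f}")
```

Note that this comparison holds the illumination profile fixed. In practice, two-photon excitation uses roughly twice the wavelength, so the benefit lies mainly in confining excitation to the focus rather than in a smaller focal spot.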

It should be noted that all-optical experiments require careful control to avoid heating effects induced by NIR stimulation [55]. Compared to experiments in rodents or organotypic tissues, small translucent organisms such as zebrafish larvae are fragile and easily burned. Their small size, typically smaller than the beam waist exiting the microscope objective, means that a significant fraction of the animal's skin is exposed to defocused light, even while a tight, diffraction-limited focus may be achieved beneath the skin. When exposed to NIR light, even small amounts of dark skin pigment can lead to unintentional burning. Furthermore, ectotherms such as zebrafish and Drosophila lack the ability to maintain a constant internal body temperature and typically regulate temperature behaviorally (e.g., by heat-seeking or heat-avoidance behavior [3]). Even a 1–2 K rise in temperature may have a notable effect on the physiology under study. In two-photon imaging, intensity is typically kept around 0.1–0.5 mW/μm<sup>2</sup> [56]. In both two-photon imaging and photostimulation, a potential control experiment to evaluate the one-photon NIR effect is to test photostimulation with a lower peak power density; e.g., in mode-locked Ti:Sapphire lasers, switching into continuous-wave mode provides the same average power with a peak power density that is orders of magnitude lower.
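The size of the difference between mode-locked and continuous-wave operation can be estimated with the common approximation $P_{peak} \approx P_{avg} / (f_{rep}\,\tau)$, which ignores the pulse-shape factor. The repetition rate, pulse duration, and average power below are illustrative assumptions typical of Ti:Sapphire oscillators, not values given in the chapter.

```python
def peak_to_average_ratio(rep_rate_hz, pulse_duration_s):
    """Approximate peak/average power ratio of a pulse train (rectangular-pulse estimate)."""
    return 1.0 / (rep_rate_hz * pulse_duration_s)

# Illustrative assumptions, not values from the chapter.
rep_rate_hz = 80e6          # pulse repetition rate
pulse_duration_s = 150e-15  # pulse duration at the sample
average_power_mw = 10.0     # same average power in both modes

ratio = peak_to_average_ratio(rep_rate_hz, pulse_duration_s)
peak_power_mode_locked_mw = average_power_mw * ratio
peak_power_cw_mw = average_power_mw  # in continuous-wave mode, peak equals average

print(f"Peak/average ratio when mode-locked: {ratio:.1e}")
print(f"Peak power, mode-locked: {peak_power_mode_locked_mw:.1e} mW")
print(f"Peak power, CW:          {peak_power_cw_mw:.1e} mW")
```

With these numbers the peak power differs by roughly five orders of magnitude at equal average power, so any effect that persists in continuous-wave mode is unlikely to arise from two-photon absorption and points instead to linear absorption or heating.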

#### 2 Materials

All-optical physiological experiments require an imaging system that combines a fluorescence microscope with a light path for photostimulation. The microscope serves dual purpose: first to acquire a baseline fluorescence image or movie to identify the spatial location of ROIs and second to acquire a continuous readout of the effect of photostimulation. Here we describe an example of a Bessel-beam light-sheet microscope and a 2P-CGH module.

2.1 Light-Sheet Module

The microscope schematic shown in Fig. 2 is a digitally scanned light sheet implemented by scanning a Bessel beam in a 2D plane. The combination of an axicon with a plano-convex lens generates a beam shaped as a hollow cylinder. Since this beam is collimated, any subsequent conjugate plane can be chosen as the entry pupil of the microscope. The collimated ring is conjugated with two galvanometric mirrors (G1, G2, Thorlabs, ax1210-A) and the back aperture of the excitation objective (EO; Nikon, 10X/NA 0.3 CFI Plan Fluorite).

Fig. 2 Example configuration of light-sheet microscope with optogenetics. (a) The system consists of a light-sheet imaging module (left, blue lines) and an optogenetics module (right, red lines) that share a common light path through the high-NA detection objective (DO). In the imaging module, a continuous-wave visible wavelength laser is shaped into a Bessel beam by an axicon (A) and scanned onto the sample through galvo mirrors G1 and G2 and the excitation objective (EO). Fluorescence collected through the detection objective (DO) is transmitted through a low-pass dichroic mirror (DM) and imaged onto a sCMOS camera by an electrically tunable lens (ETL). The fluorescent signal is recorded with the confocal slit detection approach schematically shown in panel b. In the 2P-CGH module, a Ti:Sa pulsed laser beam is magnified through a telescope (L1, L2) and impinges on the SLM, placed in a plane where the wavefront is flat. Lenses (L3, L4) relay the wavefront from

Fluorescence is collected through the detection objective (DO; Olympus, 20X/NA 1.0 XLUMPLFLN), transmitted through a low-pass dichroic mirror, and finally, the image is formed by a 300 mm tube lens (L5). The image is relayed to the sCMOS camera (Andor, Zyla 4.2) by a 1:1 telescope (L6 and L7, focal length 150 mm). An electrically tunable lens (ETL; Optotune, EL-16-40-TC-VIS-20D), positioned in the common focus of the telescope, can scan the signal at different depths.

To effectively eliminate the out-of-focus emission excited by the side lobes of the Bessel beam, the design implements confocal slit detection of the fluorescent signal, as shown in the inset of Fig. 2b. The slit is virtually created using an active window of the sCMOS camera that is rolled synchronously with scanning of the Bessel-beam illumination. In this way, we minimize the detection of fluorescence excited by the side lobes. The cost is a slower imaging speed.

The ETL in the detection path allows independent control of photostimulation in three dimensions and volumetric imaging, without moving either the objective lens or the sample. The accessible volume is determined by the choice of the tube lens L5. In fact, for a given objective, the image magnification is directly proportional to the focal length of the tube lens. For this reason, longer focal lengths give access to a smaller volume but achieve finer sampling of the fluorescence signal by the camera pixels. The ETL in the configuration shown in Fig. 2a allows imaging of a volume extending over 500 μm. The photostimulation can be performed on a volume extending over 200 μm in depth, with the SLM being the limiting factor in the theoretical axial FOV.
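The proportionality between tube lens focal length and magnification can be made concrete with a short calculation. The sketch below uses the 20X objective, the 300 mm tube lens L5, and the 1:1 relay from the text. The 180 mm reference tube lens, the 6.5 μm pixel size, and the 2048-pixel sensor width are assumptions based on typical manufacturer specifications and are not stated in the chapter.

```python
def effective_magnification(nominal_mag, tube_focal_mm, reference_tube_focal_mm):
    """Magnification when an objective is used with a non-standard tube lens."""
    return nominal_mag * tube_focal_mm / reference_tube_focal_mm

# From the text: 20X objective, 300 mm tube lens L5, 1:1 relay (L6, L7).
# Assumptions (not stated in the chapter): 180 mm reference tube lens,
# 6.5 um camera pixels, 2048 pixels per side.
mag = effective_magnification(20.0, 300.0, 180.0)
pixel_um = 6.5
n_pixels = 2048

sampling_um = pixel_um / mag     # sample-plane size of one camera pixel
fov_um = n_pixels * sampling_um  # lateral field of view on the camera

print(f"Effective magnification: {mag:.1f}x")
print(f"Sampling: {sampling_um:.3f} um/pixel, lateral FOV: {fov_um:.0f} um")
```

Under these assumptions the sampling is just under 0.2 μm per pixel, in line with the design goal stated in Subheading 1.2.1, at the cost of a lateral FOV smaller than the whole larval brain.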

2.2 2P-CGH Module

We implement computer-generated holography for spatial patterning of the photostimulation light. As shown in Fig. 2a, the ultrashort pulses of NIR light emitted by the Ti:Sa laser (Coherent, Mira-900F) pass through a combination of a half-wave plate (HWP) and a Glan–Thompson polarizer (GTP) to control the average power. A mechanical shutter (S; Uniblitz, LS2S2Z1) blocks or transmits the light. The beam is then magnified through a telescope (L1 and L2, focal lengths 25.4 mm and 150 mm) and impinges on the SLM (Meadowlark Optics, P1920-600-1300-HDMI), placed in a plane where the wavefront is flat (where the Gaussian laser beam is collimated). Subsequent lenses relay the wavefront from the SLM to the pupil of the DO: here, the Fourier lens L3 (focal length 250 mm) and lens L4 (focal length 500 mm) magnify the beam to fill the back aperture of the objective. In the focal plane of lens L3, an inverse pinhole (IP, diameter 1.4 mm) blocks the zero-order diffraction spot.

Fig. 2 (continued) the SLM to the pupil of the DO. In the focal plane of lens L3, the inverse pinhole (IP) blocks the zero-order diffraction spot. (b) Left: Scheme of confocal slit detection. An active window (yellow line) on the sCMOS camera is rolled synchronously with the scanning of the Bessel beam illumination (cyan line). In this way, we minimize the detection of fluorescence excited by the side lobes of the Bessel beam (light cyan halo). Right: Bessel beam projected into uniform fluorescent solution. The zoom in on the middle region of the FOV shows the central lobe of the Bessel beam as it appears with confocal slit detection

The light paths of the light-sheet microscope and the 2P-CGH module join at a dichroic mirror (DM; Semrock, FF01-720/SP-25), with a cutoff wavelength at 720 nm. This mirror reflects the NIR stimulation light to the sample and transmits the fluorescent readout to the sCMOS. Two-photon-excited (TPE) epifluorescence can be recorded, and this is useful to characterize the photostimulation beam as described in Subheading 3.1.

The choice of focal lengths of lenses L3 and L4 is important in the design of the experiment. If the SLM image at the objective back aperture is smaller than the aperture itself, the NA of the objective is not fully exploited. This might be an intentional choice. Otherwise, if the SLM image is larger, some stimulation light gets lost. The best option is to choose a telescope magnification that matches the dimension of the SLM image at the back aperture to the aperture itself. In this case, the lateral resolution is only limited by the objective NA.
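The matching condition can be checked with a few lines of arithmetic. The following is a minimal sketch, not part of the described setup: the focal lengths of L3 and L4 are those given above, whereas the SLM side length, the back-aperture diameter, and the function name `pupil_fill` are hypothetical placeholders. The effective NA is estimated by simple proportionality to the fill ratio, which ignores the Gaussian beam profile and the rectangular shape of the SLM.

```python
import math

def pupil_fill(slm_side_mm, f_l3_mm, f_l4_mm, pupil_diameter_mm, objective_na):
    """Compare the relayed SLM image with the objective back aperture."""
    magnification = f_l4_mm / f_l3_mm        # magnification of the L3-L4 telescope
    image_mm = slm_side_mm * magnification   # size of the SLM image at the back aperture
    fill = image_mm / pupil_diameter_mm      # 1.0 means exact matching
    if math.isclose(fill, 1.0, rel_tol=1e-6):
        status = "matched"
    elif fill < 1.0:
        status = "underfilled"               # objective NA not fully exploited
    else:
        status = "overfilled"                # part of the stimulation light is clipped
    # Rough estimate: the usable NA scales with the filled fraction of the pupil.
    effective_na = objective_na * min(fill, 1.0)
    return image_mm, fill, status, effective_na

# Focal lengths are from the text; SLM size, pupil diameter and NA are placeholders.
print(pupil_fill(slm_side_mm=10.0, f_l3_mm=250.0, f_l4_mm=500.0,
                 pupil_diameter_mm=20.0, objective_na=1.0))
```

With real component values, an "underfilled" result corresponds to the intentional low-NA choice mentioned above, and an "overfilled" result quantifies how much of the SLM image falls outside the aperture.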

The pixelated structure of the SLM chip introduces a zero-order diffraction spot into the reconstructed hologram, which appears as a bright spot at the center of the field, next to the hologram. To separate the hologram from the zero-order illumination, we can either proceed algorithmically or block the zero order physically. Since physical blocking alone can strongly degrade the quality of the image, it is preferable to combine the two approaches. The hologram is displaced from the zero order algorithmically by introducing a constant defocus. Once the CGH and the zero order lie at different depths, the zero-order beam is blocked by an inverse pinhole without impairing the quality of the CGH.
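Displacing the hologram along the optical axis amounts to adding a lens-like quadratic phase to the phase mask. The sketch below illustrates this with NumPy, assuming the standard Fresnel-lens term πz(x′² + y′²)/(λf²). The function name `add_defocus` and all numerical values (wavelength, focal length, pixel pitch, displacement) are illustrative assumptions, and in a real setup the effective focal length must account for the relay telescope.

```python
import numpy as np

def add_defocus(phase, z, wavelength, focal_length, pixel_pitch):
    """Add a quadratic (Fresnel-lens) phase that shifts the hologram axially by z.

    phase is a 2D phase mask in radians; all lengths are in metres.
    """
    ny, nx = phase.shape
    # SLM pixel coordinates centred on the optical axis
    y = (np.arange(ny) - (ny - 1) / 2) * pixel_pitch
    x = (np.arange(nx) - (nx - 1) / 2) * pixel_pitch
    xx, yy = np.meshgrid(x, y)
    lens = np.pi * z / (wavelength * focal_length**2) * (xx**2 + yy**2)
    # Wrap the result, since the SLM displays phase modulo 2*pi
    return np.mod(phase + lens, 2 * np.pi)

# Hypothetical numbers, for illustration only
mask = np.zeros((600, 600))
shifted = add_defocus(mask, z=20e-6, wavelength=0.9e-6,
                      focal_length=5e-3, pixel_pitch=10e-6)
print(shifted.shape, float(shifted.max()))
```

The zero order is unaffected by the displayed phase, so it stays in the nominal focal plane of L3 where the inverse pinhole sits, while the hologram is refocused by the added lens term.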

In high-resolution applications, the spatial accuracy of photostimulation is paramount. As reported in [57], the precision with which a CGH can be addressed to a specific target depends on the number of pixels in the SLM and on the number of gray-scale values available. Several companies produce SLMs with over 1000 pixels along the shorter axis, which provide the spatial accuracy required for sub-cellular manipulations and also a high number of degrees of freedom if the SLM is used as an adaptive element to correct for high-order aberrations. The accessible lateral and axial FOV for a CGH is inversely proportional to the SLM pixel size. However, a fairly large pixel size (on the order of 10 μm) is preferred because, assuming a constant inter-pixel gap, the fill factor increases with the pixel size. A larger pixel size also reduces the cross-talk between pixels. Cross-talk acts as a low-pass filter on the CGH, and it is due to the fringing-field effect, which causes gradual voltage changes across the border of neighboring pixels and


#### 2.3 Sample Preparation

In SPIM, samples are usually mounted in tubes made of fluorinated ethylene propylene (FEP), a plastic with a refractive index similar to that of water. The tube is filled with a solution of water and low-melting-temperature agarose (1.5–2%) for short-term imaging (1–3 h). For larval zebrafish, these conditions ensure stability of the sample while maintaining good physiological conditions given the time frame of the experiment. However, as reported in [59], for longer experiments (over 1 day) it is recommended to use a lower agarose percentage (0.1%) or methylcellulose solutions (3%) to ensure stability as well as proper growth of the sample, especially between 24 and 72 h post-fertilization.

#### 3 Methods

#### 3.1 Microscope Alignment

#### 3.1.1 Align the Light-Sheet Module

The microscope is aligned by visualizing the Bessel-beam profile. By using the galvanometric mirror G1, responsible for the planar scan, we can image the full profile of the Bessel beam at specific locations:


Fig. 3 Main steps of system alignment. (a) Example of good planar alignment. This image shows an overlay of the Bessel profile at different positions across the planar FOV. Inset shows a shallow Bessel beam with its side lobes. (b) Example of misaligned beam across the planar FOV. (c) Affine transformation between light-sheet and 2P-CGH module: (1) 3D plot of the coordinates fed into the algorithm to generate a point cloud CGH of random points spanning over a 3D volume. Those coordinates are defined in the coordinate system x′y′z′ of the 2P-CGH module. (2) 3D plot of the coordinates measured on the TPE image of the CGH. Those coordinates are defined in the coordinate system xyz of the light-sheet module. (3) Result of the affine transformation between 2P-CGH coordinates and light-sheet coordinates


Between Light-Sheet Module and 2P-CGH Module

and 2P-CGH module share a common coordinate system. This is accomplished by first calibrating the scan volume of the LS and then calibrating a 3D hologram. Both volumetric scans are achieved by incrementing the voltage on the ETL. Once the 2P-CGH and light-sheet volumes are calibrated with the ETL, the two coordinate systems are aligned through an affine transformation (see Note 1). This calibration is carried out before each experiment, and the resulting affine transformation matrix is applied to the ROI locations selected on the image to obtain the input coordinates for the holographic stimulation:


#### 3.2 Workflow of Light-Sheet Optogenetics Experiment

The general workflow for an all-optical physiology experiment using the system described above is similar to that of other methods described in this volume. After alignment of the system, it is typical to first acquire a baseline fluorescence image or a 3D movie to characterize the anatomical structure and possibly the baseline activity. Then, the ROIs to be stimulated are selected. The criteria for choosing the ROIs from the baseline image are set by the experimenter and fed into an algorithm for targeting light, e.g., a CGH calculation. Subsequently, we program a pulse sequence encoding the CGH, in which the stimulation light is temporally gated by an external shutter. Afterward, the fluorescence signal is recorded over time.

#### 3.2.1 Imaging Larval Zebrafish

After microscope alignment, mount a zebrafish in FEP tubing. Use brightfield illumination to orient and position the sample appropriately. Obtain a baseline fluorescence image. A high-resolution volume is recommended to help with brain registration later in image analysis. Based on anatomical or functional criteria, select regions of interest (ROIs) for photostimulation. Generate the hologram(s) and set a gating sequence or method to trigger the photostimulation sequence of interest.

#### 3.2.2 Analysis of Large-Scale Ca²⁺ Data Set

Figure 4 shows the analysis of a representative data set measuring spontaneous neural activity in a 5 days post-fertilization (dpf) zebrafish larva expressing nuclear-localized GCaMP6s in neurons, as acquired by volumetric calcium imaging on the described Bessel-beam light-sheet microscope. Fluorescence was captured in 20 volume sections of the forebrain, spaced 8 μm apart, with an acquisition rate of 0.67 Hz. Motion correction and the extraction of activity traces from individual neurons were implemented with the open-source calcium image analysis package CaImAn [60]. Motion artifacts were corrected through piecewise-rigid registration. Active neurons were then detected in the motion-corrected data through constrained non-negative matrix factorization (CNMF). To initialize CNMF, the images were first filtered with a Gaussian kernel. Thereafter, the Pearson correlation with neighboring pixels and the peak-to-noise ratio were calculated for each pixel (Fig. 4a). Local maxima in the pointwise product of the peak-to-noise ratio image and the correlation image were used as initialization positions. With CNMF, the contours of the detected neurons (Fig. 4b) and their activity traces (Fig. 4c) were then extracted. In total, 2077 active neurons were detected in the imaged brain volume (Fig. 4d, e).

#### 3.3 Choice of CGH Algorithm

The 2P-CGH module shown in Fig. 2a is based on Fourier holography and requires an appropriate algorithm to calculate the desired phase hologram. The aim of CGH algorithms is to retrieve the phase mask, namely the phase of the hologram field Uh, with which to address the SLM, given the field Uo of the target at the image plane; each of these two fields is the Fourier transform of the other. To generate a hologram of ROIs distributed in 3D, we compute a 3D hologram corresponding to a field Uo defined at different depths z, as schematically described in Fig. 5a. Optically, the Fourier transform of the field Uh is realized by the objective lens when the SLM is conjugated to the objective back aperture through a telescope (L3 and L4 in Fig. 2a).

The choice of the algorithm to calculate CGHs depends on several considerations. Ideally, the algorithm provides high diffraction efficiency, uniformity over the volume of interest, and accuracy between the target and the reconstructed hologram. The algorithm should also be fast given the real-time nature of optogenetics experiments. The shape and dimensions of the target photostimulation pattern also influence the choice of CGH algorithm. When the target object has a complex and extended lateral shape (> 1 μm), image-based algorithms are employed. They enable the generation of extremely complex illumination patterns in very short times; however, they are limited to illumination light focused on a limited number of two-dimensional planes [61]. Conversely, point-cloud algorithms allow one to target diffraction-limited spots arbitrarily distributed in the three-dimensional field of view of the optical system.

Fig. 4 Analysis of light-sheet calcium imaging data from the larval zebrafish forebrain with the open-source calcium image analysis package CaImAn. (a) Images were obtained from the forebrain of a larval zebrafish (5 dpf) expressing nuclear-localized GCaMP6s. A total of 20 volume sections were imaged with Bessel-beam light-sheet microscopy at an acquisition rate of 0.67 Hz. To detect neurons, CNMF was initialized by filtering the motion-corrected time-series data with a Gaussian kernel and calculating the peak-to-noise ratio and the Pearson correlation with the 4 nearest neighboring pixels. Local maxima in the pointwise product of the peak-to-noise ratio image and the correlation image were then used as initialization positions. (b) Contours of neurons

Many variations of these algorithms have been developed (Table 2). Here we classify them into two categories (see Note 2): classic algorithms and more recent algorithms developed specifically for optogenetics applications. Random superposition and Gerchberg–Saxton [62, 63] are the best-known algorithms. The first offers high speed but poor CGH quality in terms of uniformity and efficiency, whereas the second provides improved quality at the cost of increased computational time. Recently, new algorithms have been developed that are optimized for speed, such as a compressive-sensing version of the Gerchberg–Saxton algorithm [64]. Here the compressive-sensing method reduces the computational time by a factor of 10 without impairing the quality of the CGH achieved with the standard implementation of Gerchberg–Saxton. Another advanced algorithm is computer-generated holography by non-convex optimization [61], in which arbitrary 3D holograms are generated through non-convex optimization of custom cost functions. A further recent algorithm, DeepCGH [65], is based on unsupervised convolutional neural networks. Holograms computed via DeepCGH showed improved computational times and high efficiency suitable for neurostimulation experiments.

As an example, in the high-NA configuration described here, we implement the weighted Gerchberg–Saxton algorithm to generate stimulation ROIs with a lateral dimension between 6 and 10 μm, on the order of the size of small neurons in the larval zebrafish brain. The principle of this approach is schematically described in Fig. 5b, while Fig. 5d shows a 3D view of an extended CGH projected into a thick rhodamine slide. On the other hand, when targeting sub-cellular regions, the compressive-sensing weighted Gerchberg–Saxton algorithm is faster, but the lateral extension of the CGH depends on the numerical aperture of the detection objective. For NA = 1.0, the lateral extension of the spots is on the order of 1 μm. Similarly to the previous case, Fig. 5c illustrates the principle of point-cloud CGH, and Fig. 5e shows a 3D view of the point-cloud CGH. The compressive-sensing weighted Gerchberg–Saxton algorithm provides results comparable, in terms of spot brightness and uniformity, to the point clouds generated by the GS and WGS approaches while drastically reducing the computational time [64].

Fig. 4 (continued) extracted with CNMF in a single section. (c) The fluorescence (ΔF/F) signals of 10 randomly selected neurons. (d) Maximum intensity projection of the measured volume. (e) Activity map of 2077 fluorescence traces from neurons found in all sections, clustered by correlation coefficient

Fig. 5 CGH approach. (a) Optical transformation needed to project a 3D digital hologram. Uh represents the spatial Fourier transform of the desired pattern Uozi defined at different depths. f is the focal length of the lens (middle element). (b) Scheme of algorithms for extended CGH. Phase masks producing different features at focal planes I, II, III are combined by superposition principle at the SLM. Each of the three phase masks is calculated based on the corresponding image at the focal plane. (c) Scheme of algorithms for point-cloud CGH. Phase masks producing different features at focal planes I, II, III are combined by superposition principle at the SLM. Each of the three phase masks is calculated based on the coordinates xyz of the spots at the focal plane. (d) 3D view of image-based CGH of a grid 160 × 160 × 100 μm. (e) 3D view of point-cloud CGH of the same grid

#### Table 2

#### 3.4 Effect of Aberrations on CGH

All algorithms previously described compute CGHs based on the assumption that the sample has a uniform index of refraction. This assumption is rarely true in neuroscience since brain tissue is generally optically turbid and non-uniform. Optical inhomogeneities of samples cause a distortion of the stimulation pattern known as optical aberrations. In Fig. 6, we show the effect of optical aberrations on a point-cloud CGH. Specifically, we added to the computed CGH phase a phase term encoding horizontal coma with a progressively increasing coefficient. The aberrated CGH has a decreased intensity, and it is displaced from its original position. Sample-induced aberrations can be compensated through adaptive optics techniques using the SLM as an adaptive element.
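The intensity drop caused by coma can be reproduced qualitatively in a simple scalar Fourier model. The sketch below is not the procedure used for Fig. 6: it assumes a uniformly illuminated circular pupil, a Zernike-type horizontal coma term (3ρ³ − 2ρ)cos θ with its coefficient expressed in radians, and a two-photon signal proportional to the square of the focal intensity. The function names and grid sizes are hypothetical.

```python
import numpy as np

def coma_phase(rho, theta, coeff_rad):
    """Horizontal coma (Zernike form) with amplitude coeff_rad, in radians."""
    return coeff_rad * (3 * rho**3 - 2 * rho) * np.cos(theta)

def two_photon_peak(coeff_rad, n_pupil=64, n_fft=256):
    """Peak two-photon signal of a single spot for a given coma coefficient."""
    c = np.linspace(-1.0, 1.0, n_pupil)
    xx, yy = np.meshgrid(c, c)
    rho, theta = np.hypot(xx, yy), np.arctan2(yy, xx)
    aperture = rho <= 1.0
    pupil_field = aperture * np.exp(1j * coma_phase(rho, theta, coeff_rad))
    # The objective performs a Fourier transform of the pupil field;
    # zero padding to n_fft refines the sampling of the focal plane.
    focal_field = np.fft.fft2(pupil_field, s=(n_fft, n_fft))
    signal = np.abs(focal_field) ** 4      # two-photon signal ~ intensity squared
    return signal.max()

reference = two_photon_peak(0.0)
for coeff in (0.0, 2.0, 4.0):
    print(coeff, two_photon_peak(coeff) / reference)
```

Because the two-photon signal scales with the square of the intensity, a moderate loss in peak intensity translates into a much larger loss in fluorescence, which is consistent with the steep drop described for the aberrated spots.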

#### 4 Summary

Optogenetic manipulation coupled to light-sheet imaging is a powerful tool to perturb and monitor living translucent samples such as zebrafish larvae. Light-sheet imaging can be realized in different flavors with emphasis on balancing optical sectioning with the dimensions of the FOV and maximizing the imaging speed to resolve fast dynamics as reported in Subheading 1.1. Here, the light sheet is realized by scanning a Bessel beam in a 2D plane. This illumination offers a more uniform illumination over the FOV compared to Gaussian beams and a reduced scattering thanks to the self-healing properties of Bessel beams. However, a major drawback is the presence of the side lobes of Bessel profiles. In our microscope design, the side lobes are electronically rejected by implementing a confocal slit detection of the fluorescent signal.

Fig. 6 Effect of coma on CGH: (a) TPE fluorescence induced onto a rhodamine slide by a one-point hologram with progressively increasing horizontal coma coefficient. Subscripts on the images indicate the corresponding peak-to-valley (PV) coefficient in units of the wavelength of horizontal coma. The additional coma phase introduces a dramatic drop in the fluorescence intensity compared to the case without aberrations (0 μm PV coefficient). (b) Bar plot showing the maximum intensity of each aliased fluorescent spot in a. Intensities are normalized to the maximum intensity of the aberration-free spot (0 μm PV coefficient). The maximum intensity is reduced below 50% compared to the intensity of the aberration-free spot. (c) Same fluorescence images shown in (a), where the intensity of the aberrated spots is enhanced by a constant factor. (d) Maximum intensity projection of the fluorescence images in (b) showing the position of each point across the camera chip. Besides the loss of intensity, the aberration introduces a loss of CGH fidelity

Unfortunately, this solution sacrifices the imaging speed. To improve this aspect, an interesting future upgrade of our imaging system is the implementation of a NIR light source. In fact, two-photon excitation can successfully mitigate the fluorescence induced by the side lobes, as reported in [44]. Moreover, NIR wavelengths are often preferred for brain imaging of zebrafish larvae due to their orthogonality to the visual system of the animal [2]. The photostimulation module exploits the high NA of the light-sheet detection objective to focus two-photon computer-generated holograms onto the sample. These offer the flexibility to illuminate either sub-cellular ROIs with dimensions on the order of the diffraction limit of light or whole cell bodies extending over several microns (≈ 6–10 μm). In the first case, optical aberrations due to tissue inhomogeneities are a limiting factor in reaching diffraction-limited resolution. Hence, a next step in further developing the 2P-CGH technique is to include adaptive optics methods to compensate for tissue-induced aberrations using the SLM as a corrective element. Another improvement could be to include temporal focusing in the 2P holography module [66] to increase the axial confinement of the stimulation light, as also reported in Chapters 1, 6, 7.

#### 5 Notes

Note 1 Linear affine transforms account for all linear transformations between coordinates, including lateral and axial shifts, magnification mismatches (independently for all axes), rigid rotations, and coordinate shearing. Non-linear distortions, such as barrel or pincushion distortion, would require a more complex non-linear transform. Given that the FOV is limited to hundreds of microns, non-linear effects rarely appear in a well-designed system, and a linear transform provides sufficient accuracy even for photostimulation of sub-micron structures. The linear affine transform between homogeneous coordinates is represented by a 4 × 4 matrix [67]; however, it is determined by only 12 values, as the last row of the matrix is always defined as {0, 0, 0, 1}. Hence, the matrix can be determined through least-squares minimization between the vectors Xi of the image coordinates and the corresponding vectors of 12 SLM coordinates. The accuracy of the matrix estimation is improved by using more than 12 coordinates (see Subheading 3.1.3).
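The least-squares estimate of the affine matrix can be sketched with NumPy as follows. This is an illustrative implementation rather than the calibration code of the described system: the function names are hypothetical, `src` stands for the spot coordinates measured on the light-sheet image, and `dst` for the coordinates fed to the CGH algorithm. At least four non-coplanar point pairs are needed, and more pairs average out localization noise.

```python
import numpy as np

def fit_affine_3d(src, dst):
    """Least-squares 4x4 affine matrix T such that dst ~ T @ [src, 1].

    src, dst: arrays of shape (N, 3) with N >= 4 non-coplanar points.
    """
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    src_h = np.hstack([src, np.ones((len(src), 1))])        # homogeneous coordinates
    params, *_ = np.linalg.lstsq(src_h, dst, rcond=None)    # shape (4, 3): 12 unknowns
    transform = np.eye(4)                                   # last row stays {0, 0, 0, 1}
    transform[:3, :] = params.T
    return transform

def apply_affine(transform, points):
    """Map (N, 3) points through a 4x4 affine matrix."""
    points = np.asarray(points, dtype=float)
    points_h = np.hstack([points, np.ones((len(points), 1))])
    return (points_h @ transform.T)[:, :3]
```

In use, `fit_affine_3d` would be run once per calibration on the measured and requested spot positions, and `apply_affine` would then convert the ROI locations selected on the image into input coordinates for the hologram calculation.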

Note 2 Algorithms for CGH.

#### Problem Description

The mathematical approach to calculating a CGH differs depending on whether we deal with image-based or point-cloud CGHs. To target complex and extended shapes spanning a few focal planes, the electric field of the desired pattern in each focal plane is modeled as

$$U_o(\xi, \eta, z) = u_o(\xi, \eta)\, e^{i\phi_z},\tag{2}$$

where uo is the square root of the desired intensity at depth z and ϕz is a random phase. The hologram field is defined as the fast Fourier transform (FFT) of the field Uo at each focal plane. Then the phase delay between the fields at different depths is optimized to get an interference pattern at the SLM plane as close as possible to the target illumination pattern. On the other hand, for diffraction-limited spots arbitrarily distributed in the 3D FOV, the hologram field is defined as the wavefront generated by the coordinates x, y, z of the single spots in the focal planes. Hence, the field at the SLM is calculated as the superposition of the fields generating each single spot:

$$U_h = \sum_{n=1}^{N} u_{o,n} \cdot e^{-i\left(\frac{2\pi}{\lambda f}(x_n \cdot x' + y_n \cdot y') + \frac{z_n \pi}{\lambda f^2}(x'^2 + y'^2) + \phi_n\right)},\tag{3}$$

where uo,n is the square root of the desired intensity of each spot at the focal plane, f is the focal length of the optical system, λ is the wavelength of the light source, xn, yn, zn are the coordinates of the nth of the N desired spots in 3D space, x′ and y′ are the coordinates of each SLM pixel, and ϕn is a constant term. This field can be related to the field at the focal plane via a discrete Fourier transform (DFT) of the single spots. Also in this case, the aim is to optimize the phase delay between the fields generating each single spot.
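As an illustration, the superposition of Eq. 3 can be evaluated directly on the SLM pixel grid. The following NumPy sketch assumes that the defocus term scales as πzn/(λf²), which keeps the exponent dimensionless, and follows the sign convention of Eq. 3. All function names and numerical values (grid size, pixel pitch, wavelength, focal length) are hypothetical.

```python
import numpy as np

def slm_grid(n_pixels, pixel_pitch):
    """Centred SLM pixel coordinates x', y' (metres) for a square chip."""
    c = (np.arange(n_pixels) - (n_pixels - 1) / 2) * pixel_pitch
    return np.meshgrid(c, c)

def spot_phase(xp, yp, spot, wavelength, f):
    """Prism plus lens phase steering light to the spot (x_n, y_n, z_n)."""
    x, y, z = spot
    return (2 * np.pi / (wavelength * f) * (x * xp + y * yp)
            + np.pi * z / (wavelength * f**2) * (xp**2 + yp**2))

def superposition_field(spots, amplitudes, offsets, xp, yp, wavelength, f):
    """Hologram field U_h as the sum of the single-spot fields."""
    field = np.zeros(xp.shape, dtype=complex)
    for spot, u, phi in zip(spots, amplitudes, offsets):
        field += u * np.exp(-1j * (spot_phase(xp, yp, spot, wavelength, f) + phi))
    return field

# Two spots at different depths; the SLM displays only the phase of U_h.
xp, yp = slm_grid(64, 10e-6)
spots = [(10e-6, 0.0, 0.0), (-5e-6, 8e-6, 20e-6)]
field = superposition_field(spots, [1.0, 1.0], [0.0, 1.3], xp, yp, 0.9e-6, 5e-3)
phase_mask = np.angle(field)
print(phase_mask.shape)
```

Taking the phase of this sum with random offsets ϕn corresponds to the random superposition approach; the iterative algorithms described below refine the offsets (and, for WGS, the weights) of the individual terms.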

#### Classic CGH Algorithms

#### • Random Superposition (RS)

The electric field of the desired pattern in each focal plane is modeled as in Eq. 2. The hologram is calculated as the inverse Fourier transform of the field Uo, and it is backpropagated to the SLM plane according to the Fresnel propagator in Fourier optics [68]. Then the interference field of all the holograms is computed. The phase of this interference field is the hologram retained as the phase mask to load onto the SLM. The average of the hologram fields at different depths is based on the assumption of independence between the depths. Thus, this method does not take into account the interference of the fields at different planes. This results in a degradation of the reconstructed holograms, which is evident when we project holograms with more than 2–3 depths [61]. This method is very fast compared to iterative algorithms.

#### • Gerchberg–Saxton (GS)

The electric field Uo of the desired pattern is modeled as in the previous algorithm. Then the hologram is calculated as the inverse Fourier transform of the object field Uo. In the resulting hologram field, the phase is retained, while the amplitude is substituted with a Gaussian amplitude. This new field undergoes another Fourier transform to produce the object field at the focal plane. The pattern produced at the focal plane is similar to the desired one, but it has a decreased efficiency; hence, the phase is kept, while the amplitude is replaced with that of the desired pattern. At this stage, the inverse Fourier transform of the field is calculated to retrieve the phase mask at the SLM plane. This procedure is repeated iteratively until the algorithm converges to a phase mask that produces the best intensity pattern. This algorithm can be implemented for 3D holograms by including in the calculation the propagation of the fields to different depth levels according to diffraction theory [68].

#### • Weighted Gerchberg–Saxton algorithm (WGS)

This is a modified version of GS in which all the features in the CGH have a better uniformity. For this purpose, an extra degree of freedom is introduced in the CGH calculation, namely the intensity weight of every feature in the pattern. At each iteration, the weight of each feature is adjusted by comparing its intensity at the last iteration with the intensity of the target. The computational cost remains the same as that of GS.
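For point-cloud holograms, GS and WGS can be written compactly in terms of the per-spot phases at the SLM pixels. The sketch below follows a commonly used multi-spot formulation with equal target intensities; it is a simplified illustration, not the implementation used for the data in this chapter. The normalized pupil coordinates, the `(kx, ky, d)` spot parameterization, the sign convention, and all function names are assumptions made for the example.

```python
import numpy as np

def point_cloud_phases(spots, n_pixels):
    """Phase of each spot at every SLM pixel, shape (n_spots, n_pixels**2).

    spots: rows (kx, ky, d) giving lateral shifts in resolution units and a
    defocus coefficient, all in normalized pupil units (hypothetical).
    """
    c = (np.arange(n_pixels) - n_pixels / 2) / n_pixels
    xx, yy = np.meshgrid(c, c)
    xx, yy = xx.ravel(), yy.ravel()
    spots = np.asarray(spots, dtype=float)
    kx, ky, d = spots[:, 0:1], spots[:, 1:2], spots[:, 2:3]
    return 2 * np.pi * (kx * xx + ky * yy + d * (xx**2 + yy**2))

def gerchberg_saxton(deltas, n_iter=50, weighted=True, seed=0):
    """Multi-spot GS (weighted=False) or WGS (weighted=True).

    Returns the SLM phase and the relative intensity of each spot.
    """
    rng = np.random.default_rng(seed)
    n_spots = deltas.shape[0]
    theta = rng.uniform(0, 2 * np.pi, n_spots)   # phase offset of each spot
    weights = np.ones(n_spots)                   # extra degree of freedom of WGS
    for _ in range(n_iter):
        # SLM phase: argument of the weighted superposition of the spot fields
        slm_phase = np.angle(np.sum(
            weights[:, None] * np.exp(1j * (deltas + theta[:, None])), axis=0))
        # Complex field amplitude that this phase mask produces at each spot
        v = np.mean(np.exp(1j * (slm_phase[None, :] - deltas)), axis=1)
        if weighted:
            weights *= np.mean(np.abs(v)) / np.abs(v)   # boost the dim spots
        theta = np.angle(v)
    return slm_phase, np.abs(v) ** 2

def uniformity(intensities):
    """1 for perfectly equal spots, smaller for unequal spots."""
    return 1 - (intensities.max() - intensities.min()) / (intensities.max() + intensities.min())

spots = [(-6, -4, 0), (5, -7, 2), (8, 3, -3), (-3, 6, 1),
         (0, 9, -1), (-9, 1, 3), (3, 0, 0), (7, -2, -2)]
deltas = point_cloud_phases(spots, n_pixels=64)
_, i_gs = gerchberg_saxton(deltas, weighted=False)
_, i_wgs = gerchberg_saxton(deltas, weighted=True)
print("GS uniformity:", uniformity(i_gs), "WGS uniformity:", uniformity(i_wgs))
```

The only difference between the two branches is the weight update, which is why the cost per iteration is essentially unchanged when moving from GS to WGS.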

#### Recent Algorithms

#### • Compressive-Sensing Gerchberg–Saxton (CS-GS)

This algorithm is a faster version of the GS for sparse diffraction-limited spots in 3D. The field at the SLM is defined as reported in Eq. 3, and it is iteratively calculated for a subset of cM (where c is the compression factor and M is the number of SLM pixels) randomly distributed pixels of the SLM with GS algorithm. The last iteration is a standard GS calculation over all the M pixels in the SLM. The computational cost scales as Ni × Np × cM + Np × M, and this cost for cNi << 1 can be approximated to M × Np, where Ni is the number of iterations, Np is the number of points, and M is the number of SLM pixels. It has been demonstrated [64] that a speed up of over one order of magnitude can be achieved without compromising the quality of the obtained hologram compared to regular multi-spot GS.
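The scaling above can be made concrete with a short calculation. The sketch below compares the stated CS-GS cost with the cost Ni × Np × M of running every iteration on all pixels; the numbers of iterations, points, and pixels and the compression factor are hypothetical values chosen only to illustrate the regime c × Ni << 1.

```python
def gs_cost(n_iter, n_points, n_pixels):
    """Cost of standard multi-spot GS: every iteration uses all SLM pixels."""
    return n_iter * n_points * n_pixels

def cs_gs_cost(n_iter, n_points, n_pixels, compression):
    """Cost of CS-GS: n_iter compressed iterations plus one full iteration."""
    return n_iter * n_points * compression * n_pixels + n_points * n_pixels

def cs_speedup(n_iter, compression):
    """Ratio of the two costs; it approaches n_iter when compression * n_iter << 1."""
    return n_iter / (compression * n_iter + 1)

# Hypothetical example: 20 iterations, 100 spots, 1920 x 1152 pixels, c = 0.005
n_iter, n_points, n_pixels, c = 20, 100, 1920 * 1152, 0.005
print(gs_cost(n_iter, n_points, n_pixels) / cs_gs_cost(n_iter, n_points, n_pixels, c))
print(cs_speedup(n_iter, c))
```

In this regime the cost is dominated by the single full iteration, so the achievable gain is set by the number of iterations that would otherwise be run on all pixels.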

#### • Compressive-Sensing Weighted Gerchberg–Saxton (CS-WGS)

This algorithm is a faster version of the WGS for sparse diffraction-limited spots in 3D. WGS is incompatible with the compressive-sensing approach, as it quickly leads to divergence, but it is possible to run CS-GS for k − 1 iterations and then perform a final iteration with WGS. This method, known as CS-WGS, has approximately double the computational cost of CS-GS when c × Ni << 1. As reported in [69], both CS-WGS and CS-GS can be implemented with GPU calculations on a low-cost graphics card. This upgrade allows a CGH to be calculated ten times faster than with the standard version using the FFT algorithm.

#### • Non-convex Optimization for VOlumetric CGH (CGH-NOVO)

This iterative algorithm produces phase masks based on the non-convex minimization of a custom-defined cost function [61]. Given the intensity distribution of the target pattern V at different depths z, the corresponding hologram is calculated via the RS algorithm. A cost function is defined as

$$L\big(I(\phi_s), V\big),\tag{4}$$

where I(ϕs) is the distribution generated by the random phase mask ϕs at each plane and V is the intensity of the desired pattern at the focal plane. The minimization of the cost function in Eq. 4 gives the optimal phase ϕ. When I(ϕ) and the cost function have a well-defined derivative, the problem is solved with a gradient descent approach.

CGH-NOVO offers the advantage of choosing a cost function tailored to the specific application. For instance, when one wants to stimulate a subset of neurons while keeping the intensity received by off-target neurons below some illumination threshold, cost functions optimized for binary patterns are suitable. Another example of the flexibility of CGH-NOVO is the possibility of implementing a cost function optimized for 2P illumination. This algorithm can be implemented using a GPU to speed up the calculations.

#### • DeepCGH

The DeepCGH algorithm [65] uses convolutional neural networks (CNNs) with unsupervised learning to compute 3D holograms with the image-based approach. The distribution of amplitudes A(x, y, z) of the 3D target pattern at the focal planes is the input of the CNN. The CNN outputs an estimate of the complex field P(x, y, z = 0) at depth z = 0. This field, backpropagated to the SLM plane via a 2D inverse Fourier transform, gives the phase map Φ of the CGH. During unsupervised training of the CNN, the goal is to minimize a loss function L(A, Ã), where A is the intensity distribution of the target and Ã is the intensity associated with the predicted SLM phase. As shown in [65], this algorithm is about 10 times faster and up to 41% more accurate than iterative methods such as CGH-NOVO and GS. Moreover, experimentally, DeepCGH holograms can elicit two-photon absorption with 50% higher efficiency compared to GS or CGH-NOVO algorithms, given the same experimental conditions [65]. However, this algorithm requires a high-performance graphics card for GPU calculations.

#### Acknowledgements

This work was supported by funding from TU Delft and The Netherlands Organisation for Health Research and Development (ZonMw).

#### References


isotropic spatial resolution. Nature Methods 12:1171–1178


imaging data analysis. eLife 8. https://doi.org/10.7554/eLife.38173


sensing Gerchberg–Saxton algorithm. Methods and Protocols 2:2


Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

## Widefield Multiphoton Imaging at Depth with Temporal Focusing

## Philip Wijesinghe and Kishan Dholakia

#### Abstract

Optical imaging has the potential to reveal high-resolution information with minimal photodamage. The recent renaissance of super-resolution, widefield, ultrafast, and computational imaging methods has broadened its horizons even further. However, a remaining grand challenge is imaging at depth over a widefield and with a high spatiotemporal resolution. This achievement would enable the observation of fast collective biological processes, particularly those underpinning neuroscience and developmental biology. Multiphoton imaging at depth, combining temporal focusing and single-pixel detection, is an emerging avenue to address this challenge. The novel physics and computational methods driving this approach offer great potential for future advances. This chapter articulates the theories of temporal focusing and single-pixel detection and details the specific approach of TempoRAl Focusing microscopy with single-pIXel detection (TRAFIX), with a particular focus on its current practical implementation and future prospects.

Key words Imaging at depth, Temporal focusing, Widefield imaging, Multiphoton microscopy, Single-pixel imaging, Compressive sensing

#### 1 Introduction

Optical imaging is expanding its boundaries with powerful emerging capacities for super-resolution of subcellular features [1, 2], widefield imaging across scales [3, 4], and achieving high temporal resolution of ultrafast biological processes [5, 6]. An important frontier in this endeavor is imaging at greater depths through scattering media, which is being addressed by multiphoton excitation and adaptive optics [7–10]. The prospect of recovering high-resolution information over large volumes of tissues is particularly attractive for neuroscience and developmental biology. However, the high spatiotemporal resolutions needed to observe many biological processes are presently challenging to achieve using point-by-point scanning, as in conventional fluorescence microscopy, which is often combined with aberration correction schemes [3]. Recently, a new strategy of refocusing ultrafast laser pulses in the time domain rather than solely in the spatial domain [11, 12] has revitalized the concept of widefield multiphoton imaging at depth [13–15].

Eirini Papagiakoumou (ed.), All-Optical Methods to Study Neuronal Function, Neuromethods, vol. 191, https://doi.org/10.1007/978-1-0716-2764-8\_9, © The Author(s) 2023

The premise that one can focus light in time rather than in space has emerged rapidly over the past decades. Spatial focusing, the concentration of the intensity of a light field in space, is ubiquitous in virtually all optical imaging systems. It is well-known that spatial frequencies can be focused in space via the Fourier transforming action of a lens [16]. Similarly, spectral frequencies can be focused in time, with much of the same equivalence in their Fourier transform properties, by introducing phase modulation and spatial dispersion [17]. This has formed the foundation of pulse compression and has led to chirped pulse amplification [18], which won Strickland and Mourou a share of the Nobel Prize in Physics in 2018. Defocusing or time stretching, on the other hand, has enabled the recording of ultrashort phenomena and spectral content in the time domain [19].

More recently, the simultaneous use of spatial and temporal focusing was presented in 2005 concurrently by Oron et al. [11] and Zhu et al. [12], demonstrating an interesting phenomenon wherein a time-compressed pulse can be made to exist only at the focus of a lens. Away from the focus, the pulse broadens in space and in time, with a concomitant rapid reduction in its peak intensity. This restriction of the pulse to the focal region enables axial confinement of non-linear optical excitation in a scanless, widefield illumination scheme. This development, termed temporal focusing (TF), has had a profound impact on multiphoton microscopy, where previously axial confinement could only be achieved by point scanning a highly focused beam across the image plane, which severely limits temporal resolution. The initial work was followed by a flurry of demonstrations of TF in widefield imaging [20–25], excitation [26, 27], harmonic generation [28, 29], super-resolution [30], micromachining [31–35], remote focusing [36, 37], tissue ablation [38], and trapping [39], among others. Unsurprisingly, the capacity for ultrafast widefield excitation has particularly flourished in optogenetics and neuroimaging [40–49]. This is because the absence of scanning has readily allowed for the simultaneous excitation and measurement of neuronal firing events on the millisecond timescale and over wide fields of view, previously unattainable by point-scanning approaches.

While a direct mathematical correspondence can be made between focusing in space and in time [19], it has become evident that TF behaves differently to spatial focusing in the presence of wavefront aberrations and through scattering media [50–55] due to the added angular diversity of the illumination spectra at the focus. For instance, the addition of TF has demonstrated a substantial improvement in propagation through scattering media and a reduction in speckle at the focus [51]. This discovery has enabled precise patterned multiphoton excitation at depth and has led to remarkable progress in optogenetics. This area is reviewed in [56] and in Chapter 10.

However, beyond this improved excitation, the capacity for imaging at depth has been impeded by tissue or sample scattering in the detection arm of the optical system. Specifically, widefield detection with a camera would observe severe spatial cross-talk from depths beyond one scattering mean-free-path length making it difficult to recover signals in a conventional manner. A major development came in 2018 with the addition of spatial demixing via single-pixel detection [13, 14], enabling widefield images to be recovered without the need for spatial coherence in the detected signal. This method, termed TempoRAl Focusing microscopy with single-pixel detection (TRAFIX) [13], works by decomposing the imaging process into a different domain, or coordinate space, in which the spatial information is carried by the scattering-robust illumination, rather than by the detection itself. A hybrid between widefield and single-pixel detection can also be realized [15, 57], trading off speed and the robustness to scattering. These methods have demonstrated a major reduction in photodamage compared to point scanning by spreading the excitation power both over a widefield and in the time domain [13]. Further, the imaging scheme is amenable to novel compressive-sensing techniques [14, 58], i.e., image reconstruction from a few sparse measurements, fundamentally reducing the needed excitation power (and thus photodamage), and imaging time. The capacity for precise all-optical excitation and multiplexed detection further offers great future potential for simultaneous volumetric recording of sparse functional signals, for instance, neuronal firing events.

Widefield multiphoton imaging at depth marries a striking combination of novel physics and a new computationally driven paradigm for multiphoton microscopy. In this chapter, we describe the theory and experimental realizations, with a particular focus on the method of single-pixel imaging with TRAFIX. Importantly, we also note many of the current challenges and prospective advances of these techniques.

#### 2 Methods

Achieving widefield multiphoton imaging at depth requires TF in illumination and spatial demixing, such as single-pixel recording, in the detection. We present, in turn, the theory of TF and single-pixel detection and the combined principle of TRAFIX. We further describe the addition of compressive-sensing and hybrid demixing strategies.

Fig. 1 Illustrations of the pulse shape in space and time in (a) spatial focusing and (b) temporal focusing. Temporal focusing realized using a (c) scattering plate, SP, and a (d) diffraction grating, DG. L: lens; Obj: objective; FP: common Fourier plane; IP: image plane. Adapted from [11]

#### 2.1 Temporal Focusing

Focusing of illumination pulses is the route by which we achieve axial sectioning in multiphoton imaging. Axial sectioning is required to record the signal precisely from the focal plane, with limited out-of-focus interference: all essential to form highly resolved three-dimensional images. Spatial and temporal focusing achieve this by different means. For instance, Fig. 1a shows a spatially focused Gaussian beam, where the drop-off in the axial intensity away from the focus is related to the Rayleigh range (zR), which is proportional to the square of the lateral beam waist. Since the probability of two- or three-photon excitation is related to the square and the cube of the field intensity, respectively, there is a strong confinement of fluorescence at the focus. Typically, a tight focal spot is needed to achieve sectioning at the micrometer scale. Figure 1b illustrates the concept of TF. Rather than confining the pulse intensity in space, the pulse width is broadened out of focus. Since the chance of multiphoton excitation is also inversely proportional to the pulse duration, axial sectioning is achieved over arbitrary spot sizes.

To describe the realization of TF in this regard, we first consider the formation of ultrashort laser pulses [12]. Ultrashort pulses are characterized by a broadband optical spectrum. The shortest possible pulse is formed when: (1) each spectral component is in phase (such that the pulse width is related to the Fourier transform of the envelope of its spectrum) and (2) the spectral components spatially overlap. Temporal dispersion, which can be described by a relative phase delay between different frequencies in the spectral domain or by a chirp in the time domain (i.e., a change in the instantaneous frequency with time), leads to a broadening of the pulse width. On the other hand, spatial dispersion leads to a separation of spectral components, limiting the available bandwidth in a local region, and thus the minimum pulse width. TF operates by forming an ideal, chirp-free pulse at the focal plane and by deliberately introducing spatial and temporal dispersion away from the focus, such that imaging can solely take place in the focal region.

The experimental realization of TF is well-described in Oron et al. [11]. Let us consider a thin scattering plate that is imaged by a perfect 4f system as illustrated in Fig. 1c. An ultrashort pulse incident onto the plate is scattered, and the individual rays that travel through the system are refocused onto the imaging plane. According to Fermat's principle, all rays that travel from one point of the scatterer (x1) and arrive at one point in the image (x2) have identical path lengths. As such, the pulse will be reconstructed at the image plane with no relative phase delay. However, at a point, P, away from the focus, rays arriving at a particular angle, θ, will have a path length difference corresponding to a delay of $z\,(1/\cos\theta - 1)/c$. The maximum phase delay due to the varied path length at P increases with θ, which is limited by the numerical aperture (NA) of the system, and with the distance from focus, z. In this scenario, the capacity for broadening the pulse out of focus is dependent on the ratio between the original pulse width and the path length difference introduced. Alternatively, an angled pulse wavefront can be incident onto the scattering plate, maximizing the phase delays away from the focus [11].

Ultimately, a much more facile configuration can be achieved using a diffraction grating in place of a scatterer (Fig. 1d). The diffraction grating separates the frequencies of the incoming pulse at differing angles. These are then refocused by a 4f system. In the common Fourier plane, the signal may be described as a collection of laterally shifted monochromatic (single-frequency) beams. For an ideal Gaussian beam, the amplitude at the common Fourier plane can be described in the time domain as [12]:

$$A_{1}(x,t) = \int_{-\infty}^{\infty} e^{-\frac{(x - \alpha\Delta\omega)^{2}}{s^{2}}} \cdot e^{-\frac{\Delta\omega^{2}}{\Omega^{2}}} \cdot e^{i\Delta\omega t} \, d\Delta\omega, \tag{1}$$

where s is the spatial width (1/e² radius) of each monochromatic beam and Ω is the spectral width. The first exponent represents a spatial Gaussian profile, laterally shifted by αΔω, where Δω is an offset in frequency and α is a proportionality constant set by the diffraction grating and the lens. The second exponent represents the amplitude scaling of each monochromatic beam, which follows the Gaussian spectrum of the original pulse. The last exponent is the phase shift relative to the group velocity.

The amplitude near the focus can be evaluated using Fresnel diffraction. This is done by performing a Fourier transform of Eq. 1, evaluated at spatial frequencies set by the focal length, f. We assume that the wavenumber for each beam is approximately the central wavenumber of the pulse, k0. The amplitude is given as [22]:

$$A_2(x, z, t) = \kappa \cdot e^{-\frac{x^2}{s_2^2}} \cdot e^{-\frac{\Omega^2}{4(1+\chi)}(t+\gamma x)^2}, \tag{2}$$

where

$$\kappa = \Omega \sqrt{\frac{i\pi f}{z_R}} \left[1 + i\frac{z}{z_M}\right]^{-\frac{1}{2}},$$

$$\gamma = \frac{k_0 \alpha / f}{1 + i z / z_M}, \qquad \chi = \frac{i z / z_B}{1 + i z / z_M}, \qquad s_2^2 = \frac{4f^2}{k_0^2 s^2} + i\frac{2z}{k_0},$$

$$z_M = \frac{1}{s^2} \frac{2f^2}{k_0}, \qquad z_R = \frac{1}{s^2 + \alpha^2 \Omega^2} \frac{2f^2}{k_0}, \qquad \text{and} \qquad z_B = \frac{1}{\alpha^2 \Omega^2} \frac{2f^2}{k_0}.$$

Notably, the spatial profile at the focus is equivalent to that defined by any one of the monochromatic beams, i.e., a Gaussian width of 2f/(k0 s). However, the modified Rayleigh range zR is dependent on both s and αΩ. Typically, αΩ ≫ s for widefield illumination; therefore, zR is defined by the extent to which the spectral dispersion fills the back aperture of the objective. Rayleigh-like coefficients zM and zB, related to the spatial and temporal distributions, respectively, additionally modify the phase evolution with time and away from the focus. Here, z is defined as the distance away from the focus (z = 0). The temporal evolution of intensity in the last exponent defines the pulse shape, and the width can be given as [22]

$$\tau(z) = \frac{2\sqrt{2\ln 2}}{\Omega} \cdot \left[1 + \frac{z_M}{z_B} \frac{z^2}{z^2 + z_M z_R}\right]^{\frac{1}{2}}. \tag{3}$$

The pulse is shortest at the focus, reaching the minimum transform-limited pulse width of 1/Ω (1/e² radius). The important conclusion of these equations is that, compared to a spatially focused Gaussian beam, TF decouples axial sectioning (zR) from the lateral beam shape (s2), which in turn allows precise control of the multiphoton excitation profile in 3D.
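The axial broadening described by Eq. 3 can be evaluated numerically. The following is a minimal sketch only: the function name `tf_pulse_width` and all numerical values (wavelength, focal length, beam size, dispersion ratio, and spectral width) are hypothetical choices for illustration and are not taken from any of the cited setups.

```python
import numpy as np

def tf_pulse_width(z, s, alpha_omega, f, k0, omega):
    """Pulse width tau(z) following Eq. 3.

    z: distance from the focus (m)
    s: radius of each monochromatic beam at the Fourier plane (m)
    alpha_omega: spatial extent of the spectral dispersion, alpha * Omega (m)
    f: focal length (m); k0: central wavenumber (rad/m)
    omega: spectral width Omega (rad/s)
    """
    z_m = 2 * f**2 / (k0 * s**2)
    z_b = 2 * f**2 / (k0 * alpha_omega**2)
    z_r = 2 * f**2 / (k0 * (s**2 + alpha_omega**2))
    tau_focus = 2 * np.sqrt(2 * np.log(2)) / omega
    return tau_focus * np.sqrt(1 + (z_m / z_b) * z**2 / (z**2 + z_m * z_r))

# Hypothetical parameters, for illustration only
wavelength = 800e-9            # m
k0 = 2 * np.pi / wavelength    # rad/m
f = 10e-3                      # m
s = 0.5e-3                     # m
beta = 6.0                     # dispersion aspect ratio alpha * Omega / s
alpha_omega = beta * s         # m
omega = 2e13                   # rad/s

z = np.linspace(-50e-6, 50e-6, 101)  # axial positions around the focus (m)
tau = tf_pulse_width(z, s, alpha_omega, f, k0, omega)
print(f"width at focus: {tau.min() * 1e15:.0f} fs")
print(f"width at |z| = 50 um: {tau.max() * 1e15:.0f} fs")
```

Far from the focus, the bracket in Eq. 3 tends to 1 + zM/zB = 1 + (αΩ/s)², so the out-of-focus broadening is set by the ratio of spectral dispersion to beam size at the Fourier plane.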

An alternative and more intuitive view of TF was offered by Durfee et al. [59] by examining the evolution of the phase delay (chirp) of each frequency with respect to the focus and lateral position. The formulation is detailed in Note 2. Figure 2a visualizes the evolution of the phase front of three selected frequencies of the pulse in positions corresponding to fractions of the Rayleigh range (zR). At the focus, the wavefronts are flat; however, they are tilted with respect to each other. This represents a linear phase delay of each frequency with lateral position (Eq. 9). A purely linear phase delay in the spectrum results in a shift in the arrival time of the pulse (via the Fourier shift theorem). Figure 2b shows the time-domain version of this signal. Simply, the pulse will sweep across the focal plane, in a phenomenon termed the pulse front tilt (PFT). Away from the focus, we can see increasing second-order dispersion (Eq. 10), corresponding to a rapid broadening of the pulse shape and a reduction in the peak magnitude. Interestingly, by introducing a group velocity dispersion of the second order to the original pulse, the focal plane of TF can be shifted within the linear region set by zR [21, 22, 60]. Using this method, TF can scan a 3D volume without physically scanning the focus or the sample. Further, this principle enables simultaneous TF excitation in 3D via holographic means [61–63]. These methods are reviewed by Ronzitti et al. [64] and may also be found in Chapters 1 and 7.

Fig. 2 Profile of a temporally focused pulse in the (a) spectral and (b) time domains. Lines in (a) represent the phase front delay of different spectral components. The profiles in (b) represent the pulse shapes and their relative delay with respect to lateral position, x. PFT: pulse front tilt

A recent discovery and a very important feature of TF is its ability to robustly propagate through scattering media [51]. This underlies its importance for widefield imaging at depth. Spatial focusing over a widefield constitutes weak focusing, or low numerical aperture (NA) illumination. In this scenario, the field at the entrance pupil of the objective lens is tightly confined in space and is refocused, taking nearly parallel trajectories through the sample to the focal plane. The wavefront propagating through the scattering media is aberrated, leading to speckle due to multiple interference. With TF, an equivalent widefield area at the sample is illuminated; however, the spectral dispersion leads to a substantially broader field intensity at the entrance pupil, corresponding to an effectively high-NA illumination scheme. Each spectral beamlet takes a different angular path through the sample. This leads to a rearrangement in the speckle patterns of each beamlet and an effective speckle reduction at the focus [51]. Figure 3 shows experimental evidence of this phenomenon. The widefield illumination shown without scattering in Figs. 3a and c remains robust to scattering with TF (Fig. 3b) but exhibits severe speckle without TF (Fig. 3d) [13].

Fig. 3 Point-spread function of (a,b) temporally focused and (c,d) spatially focused beams with (a,c) no scattering and (b,d) through 900 μm of a scattering phantom. Scale bar is 20 μm. Reproduced from [13]

The aspect ratio of spatial dispersion with respect to beam size at the Fourier plane, β = αΩ/s, is a useful parameter in quantifying the transition between the temporal and spatial focusing regimes (see Note 2). A key consideration is that at high NA, where the beam width at the common Fourier plane exceeds the spectral dispersion, i.e., β → 0, TF is equivalent to purely spatial focusing [59]. This can be validated by examining Eqs. 9 and 10 (see Note 2). In fact, from a purely theoretical perspective, at the focus, we can consider TF to be equivalent to a high-NA beam, scanned point-by-point across the same field of view. However, the PFT in TF sweeps the focal plane with a duration of $\tau_0\sqrt{1+\beta^2}$ [22], which can be several picoseconds for typical multiphoton setups. This is not practically achievable by point scanning; further, the sweep in the case of TF is completed by a single pulse. As such, TF offers a mode of illumination unavailable to conventional spatial focusing and will likely lead to the emergence of novel creative methods for precise non-linear excitation, which we discuss in Subheading 3.

Fig. 4 Imaging process in scattering media. Imaging performed (a) in parallel with a camera; (b) sequentially using point-scanning microscopy; and (c) by multiplexing with single-pixel detection

#### 2.2 Single-Pixel Detection

Widefield imaging at depth requires some form of demixing of scattering in the detection that impedes direct recovery of the signal. To explain the imaging process in a practical sense, let us first consider the storage of a two-dimensional image on digital media. An image that is W pixels in width and H pixels in height can be virtually represented as a 2D matrix of values, vwh, with indices w ∈ 0, 1, ..., W - 1 and h ∈ 0, 1, ..., H - 1. This can be stored in physical memory as a linear sequence of values, vj, that are indexed by j = hW +w. Here, by "value," we mean an 8-bit or 16-bit or any other datatype that describes the intensity or color of each pixel in the image. As a matter of fact, any image, volume, or any other digital "information" can be stored in this linear vectorized fashion. This representation will be used throughout this chapter.
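The linear indexing j = hW + w can be illustrated with a short sketch. The image size and the helper names `to_linear` and `to_2d` are hypothetical and serve only to show that row-major flattening reproduces this index convention.

```python
import numpy as np

height, width = 3, 4  # hypothetical image size: H = 3 rows, W = 4 columns

# A 2D image stored as image[h, w]
image = np.arange(height * width).reshape(height, width)

# Row-major flattening stores pixel (h, w) at linear index j = h * W + w
vector = image.reshape(-1)

def to_linear(h, w, width):
    """Map 2D pixel indices (h, w) to the linear index j."""
    return h * width + w

def to_2d(j, width):
    """Map the linear index j back to 2D pixel indices (h, w)."""
    return divmod(j, width)

j = to_linear(2, 1, width)
print(vector[j] == image[2, 1])
print(to_2d(j, width))
```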

Figure 4 illustrates various imaging processes through scattering media using this linear vectorial form. Figure 4a shows conventional imaging with a camera. Here, the sample (x = {xj : j = hW +w}) is represented in discretized form, corresponding to its mapping onto the pixels of the camera. Imaging with a camera, in an ideal case, is the mapping of the sample onto an image (y = {yj : j = hW + w}), which can be mathematically represented as y = Ix, where I is the identity matrix. In turbid media, however, light that travels from the sample is scattered and is detected by nearby camera pixels. This cross-talk leads to a loss in spatial information. At depths exceeding one mean-free path length, the contribution to yj from xj is lower than the cross-talk from other locations in the sample, thus obscuring the image.

Point-scanning methods can overcome this effect of scattering in detection by sequentially illuminating a tight spot in the sample and collecting the total signal using a single-pixel detector, for instance, a photomultiplier tube. Figure 4b illustrates this process. Scattering in the detection is no longer relevant since the total signal is detected. The spatial information is carried by the illumination. Similarly to camera imaging, we can represent the imaging process as y = Ix, with y formed sequentially over time, rather than in one shot.

Figure 4c illustrates the concept of single-pixel imaging, which operates by multiplexing the sample signal. Rather than sequentially probing one location, the signal from all locations is mixed. Scattering in detection is not an issue since the total sum of the signal is measured. The mixing weights are set by a sequence of test functions or structured patterns illuminated onto the sample. Similar to point scanning, a sequence of measurements is needed with different mixing weights. We can describe this arbitrary imaging process as y = Φx, where Φ is a measurement matrix whose rows dictate the pattern of illumination onto the sample (each row is a vector representing a 2D pattern image).

The recovery of the sample, x, from the measurements involves pre-multiplying y by the inverse of Φ, i.e., $x = \Phi^{-1} y$. For camera imaging and point scanning, this is trivial as $I = I^{-1}$. For single-pixel imaging, the capacity to invert Φ is strongly dependent on the choice and number of the test functions. Let us first consider an orthonormal basis as our measurement matrix, for instance, a Hadamard matrix [65], H, routinely used in single-pixel imaging. Its inverse is well-defined as $H^{-1} = H^{T}/N$, where N = WH is the total number of pixels in an image (W and H here being the image width and height). Using this measurement matrix, it is simple to recover x by multiplying the detected signal by the transpose of H. All orthonormal measurement matrices share this property, namely that their inverse is linearly related to their transpose (a trivial mathematical operation). In fact, one could consider point scanning to be a form of single-pixel imaging.
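As an illustration of this inversion, the sketch below simulates single-pixel measurements y = Φx with a Hadamard measurement matrix and recovers x using the transpose. It is an idealized, noise-free example under stated assumptions: the function `hadamard` uses the recursive (Sylvester) doubling construction, the sample is random, and the ±1 weights are applied directly, whereas a physical projector would have to realize negative weights by other means.

```python
import numpy as np

def hadamard(n):
    """Hadamard matrix of order n by recursive doubling; n must be a power of two."""
    if n < 1 or n & (n - 1):
        raise ValueError("n must be a power of two")
    h = np.array([[1]])
    while h.shape[0] < n:
        h = np.block([[h, h], [h, -h]])
    return h

width, height = 8, 8       # hypothetical image size
n_pixels = width * height  # N = W * H

rng = np.random.default_rng(0)
x = rng.random(n_pixels)   # vectorized sample (hypothetical)

phi = hadamard(n_pixels)   # each row is one vectorized test pattern
y = phi @ x                # one single-pixel measurement per pattern

x_rec = phi.T @ y / n_pixels  # inversion using H^-1 = H^T / N
print(np.allclose(x_rec, x))
```

Each row of `phi` can be reshaped to a height-by-width pattern for projection, following the same row-major indexing used to vectorize the image.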

For all these imaging methods, N measurements are required: either with N pixels of a camera or using N sequential recordings on the single-pixel detector. When comparing single-pixel to point-scanning detection, there are several advantages. First, point scanning illuminates a tight spot in the sample with a duty cycle of 1/N. For a maximum allowable irradiance, the signal at the single-pixel detector is typically weak. Using single-pixel detection, the duty cycle is typically 1/2, leading to a stronger signal at the detector. This results in superior signal-to-noise in detection and allows for lower excitation power for the same image quality, reducing photobleaching [13]. Second, widefield illumination of patterns may be used, which is amenable to TF illumination schemes. Third, single-pixel detectors can be used at wavelengths for which camera hardware does not exist or is prohibitively expensive, e.g., electron-multiplying CCDs in the infrared range. The major advantage of single-pixel detection is that it allows for compressive sensing, which we describe in the following section. Using compressive sensing, the number of measurements needed to reconstruct x can be significantly reduced by employing a smaller measurement matrix and more sophisticated inversion methods.

#### 2.2.1 Compressive Sensing

Compressive sensing (CS) describes the recovery of a signal from far fewer measurements than required by the Nyquist sampling criterion. The idea that one can accurately reconstruct N independent values of a signal x from M ≪ N measurements is counter-intuitive. However, let us consider a similar topic of image compression, where a one-megapixel raw image taking 4 MB can routinely be compressed into a 400-kB JPEG with imperceptible losses in quality. This process relies on the idea of "sparsity." Any image x in the Cartesian coordinate space can be represented in a different domain Ψ by coefficients s, such that x = Ψs. Images are considered sparse when there exists a domain where the coefficient vector s possesses only a few non-zero values. The number of non-zero values, K ≪ N, indicates that the image is "K-sparse," and it follows that the remaining N − K values are zero. In practice, images are not perfectly sparse; however, s possesses a few large coefficients with the rest being close to zero. For example, JPEG compression selects Ψ to be the discrete cosine transform (DCT); very briefly, a DCT is performed on x and the largest 10% of values are stored.
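The notion of sparsity in a transform domain can be demonstrated with a small numerical sketch. Here, a smooth synthetic image (a hypothetical Gaussian blob, not experimental data) is transformed with an orthonormal DCT, only the largest 10% of coefficients are retained, and the image is reconstructed. This illustrates the principle only and is not the JPEG algorithm, which additionally involves block processing and quantization.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II matrix C, so that C @ v is the 1D DCT of a vector v."""
    k = np.arange(n)[:, None]
    m = np.arange(n)[None, :]
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * m + 1) * k / (2 * n))
    c[0, :] /= np.sqrt(2.0)
    return c

n = 32  # hypothetical image side length
yy, xx = np.mgrid[0:n, 0:n]
image = np.exp(-((xx - 14.0) ** 2 + (yy - 18.0) ** 2) / (2 * 3.0 ** 2))  # smooth test image

C = dct_matrix(n)
coeffs = C @ image @ C.T  # 2D DCT: the coefficients s of the image x

# Keep only the largest 10% of coefficients by magnitude
keep = int(0.10 * coeffs.size)
threshold = np.sort(np.abs(coeffs).ravel())[-keep]
sparse_coeffs = np.where(np.abs(coeffs) >= threshold, coeffs, 0.0)

approx = C.T @ sparse_coeffs @ C  # inverse 2D DCT
rel_err = np.linalg.norm(approx - image) / np.linalg.norm(image)
print(f"relative error with 10% of coefficients: {rel_err:.2e}")
```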

CS, developed independently by Candes et al. [66] and Donoho [67] in 2006, introduced compression directly to the measurement process by leveraging the assumption that the signal to be measured is sparse. This is in contrast to making a full measurement and performing compression at a later time. Consider a case where the imaging process y = Φx is compressed by performing 10% of the measurements needed to achieve Nyquist sampling (M = N/10), such that Φ has N columns and M rows. It is evident that this problem is underdetermined and the solution space of all possible x that can generate y is infinite. For instance, in point scanning, this would be equivalent to imaging the first 10% of the field of view and leaving the other 90% up to interpretation. CS approaches this problem by a careful design of the measurement process. We realize that x can be decomposed into a different domain with a sparse coefficient vector, modifying the imaging process to y = Φx = ΦΨs = Θs [68] (Fig. 5). A solution to the CS problem lies in finding the sparsest s that satisfies y = Θs. The major challenge in CS is twofold: first, finding an efficient minimization algorithm that can quantify the "sparsity" of s; and, second, selecting an appropriate measurement matrix Φ. We consider these aspects in turn.

In early CS work, a minimization algorithm using an l1-norm proved to be effective in finding a sparse representation of s and thus in recovering a compressed signal [66, 67]. Using this approach, the estimated image $\hat{x}$ is recovered as

$$\hat{x} = \Psi \, \underset{s}{\operatorname{argmin}} \, \lVert s \rVert_1, \quad \text{s.t.} \quad \Theta s = y, \quad \text{and} \quad \lVert s \rVert_1 = \sum_{i=1}^{N} \lvert s_i \rvert. \tag{4}$$

Fig. 5 Visual illustration of the compressive-sensing process in matrix form

In other words, we assume that s is sparse if its l1-norm is small; thus, we find the s with the smallest l1-norm that still satisfies the imaging process Θs = y. This optimization can be solved using basis pursuit methods [66–68]. A convenient MATLAB toolbox is provided by Candes et al.<sup>1</sup> alongside their original publications [66, 69]. Several alternative efficient algorithms have been proposed [70–73]. Typically, these methods sacrifice some aspect of accuracy to gain a substantial advantage in speed. We examine one such algorithm in a later section.
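To make the idea of l1-based recovery concrete, the sketch below recovers a sparse vector from fewer measurements than unknowns. Note the assumptions: it does not solve the equality-constrained problem of Eq. 4 and is not the toolbox or any of the algorithms cited above. Instead, it uses iterative soft thresholding (ISTA) on the closely related penalized problem, minimizing ½‖Θs − y‖² + λ‖s‖₁. The matrix Θ is a random Gaussian matrix, the signal is taken to be sparse directly in the measurement domain (Ψ = I), and the sizes, the value of λ, and the function names are hypothetical.

```python
import numpy as np

def soft_threshold(v, t):
    """Element-wise soft thresholding, the proximal operator of t * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def ista(theta, y, lam=0.01, n_iter=3000):
    """Minimize 0.5 * ||theta @ s - y||_2^2 + lam * ||s||_1 by ISTA."""
    step = 1.0 / np.linalg.norm(theta, 2) ** 2  # 1 / (largest singular value)^2
    s = np.zeros(theta.shape[1])
    for _ in range(n_iter):
        grad = theta.T @ (theta @ s - y)
        s = soft_threshold(s - step * grad, step * lam)
    return s

rng = np.random.default_rng(1)
n_unknowns, n_meas = 64, 40              # N unknowns, M < N measurements
s_true = np.zeros(n_unknowns)
s_true[[5, 20, 47]] = [1.5, -2.0, 1.0]   # K = 3 non-zero coefficients

theta = rng.standard_normal((n_meas, n_unknowns)) / np.sqrt(n_meas)
y = theta @ s_true                       # noise-free compressed measurements

s_hat = ista(theta, y)
rel_err = np.linalg.norm(s_hat - s_true) / np.linalg.norm(s_true)
print(f"relative recovery error: {rel_err:.3f}")
```

The penalty weight λ trades sparsity against data fidelity; a non-zero λ slightly shrinks the recovered coefficients, which is one example of the accuracy that faster or simpler solvers may give up.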

Let us now consider the choice of the measurement matrix. The CS problem can be solved with high accuracy using Eq. 4 provided that the number of measurements taken exceeds the sparsity, i.e., M ≥ K [69]. Additionally, the problem is termed well-conditioned if no two coefficients in s are sampled by a similar sequence of test weights, i.e., no two columns of Θ are the same. More formally, the measurement matrix should satisfy the restricted isometry property (RIP) [66]. A more intuitive method to verify the suitability of Φ for CS is using the mutual coherence metric, defined as $c = \max_{i \neq j} \lvert \phi_i^{T} \phi_j \rvert$, where $\phi_i$ is the i-th column of Φ. A low mutual coherence implies that the columns are nearly orthonormal. Matrices Φ that satisfy RIP are difficult to generate deterministically [68]. Fortuitously, randomly generated matrices, for example, with Gaussian or Bernoulli distributions, are highly likely to satisfy RIP and mutual incoherence [74]. Thus, random generation can be found at the heart of many CS methods.
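The mutual coherence metric is straightforward to compute. The sketch below is a minimal example under one stated assumption: the columns are normalized to unit length before the inner products are taken, so that the metric lies between 0 and 1. The function name `mutual_coherence` and the matrix sizes are hypothetical.

```python
import numpy as np

def mutual_coherence(phi):
    """Largest absolute inner product between distinct unit-normalized columns."""
    cols = phi / np.linalg.norm(phi, axis=0, keepdims=True)
    gram = np.abs(cols.T @ cols)
    np.fill_diagonal(gram, 0.0)  # exclude i == j
    return gram.max()

rng = np.random.default_rng(2)
n_meas, n_unknowns = 32, 64  # M rows (measurements), N columns (unknowns)

gaussian = rng.standard_normal((n_meas, n_unknowns))
bernoulli = rng.choice([-1.0, 1.0], size=(n_meas, n_unknowns))

print(f"Gaussian matrix:  c = {mutual_coherence(gaussian):.2f}")
print(f"Bernoulli matrix: c = {mutual_coherence(bernoulli):.2f}")
```

A matrix with two identical columns gives c = 1, the worst case, while a matrix with mutually orthogonal columns gives c = 0.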

<sup>1</sup> https://statweb.stanford.edu/~candes/software/l1magic/.

While CS is broadly applicable to many signal processing techniques, its use in microscopy is met with additional constraints and challenges. In particular, s is not ideally sparse, and the imaging process introduces additional noise to the measurements. Thus, CS is typically an accurate but lossy estimation. Additionally, the projection of Φ into the imaging plane is limited by the finite spatial-frequency bandwidth of the optical system and by diffraction and scattering. We discuss this in the context of imaging and propose modified measurement matrices and recovery methods in the following section.

#### 2.3 TRAFIX

TempoRAl Focusing microscopy with single-pixel detection (TRAFIX) [13] combines the capacity to project widefield depth-sectioned patterns through scattering tissue with multiplexed detection and compressive sensing, realizing widefield imaging at depth. This configuration, in particular, has strong prospects for rapid, low-photodamage imaging and is well-positioned to provide utility in neuroimaging and developmental studies that cannot be offered by conventional point-scanning multiphoton technologies.

#### 2.3.1 Setup

The practical implementation of TRAFIX involves a diffraction grating to enable temporal focusing, a single-pixel detector, and a dynamic light shaping element, for instance, a spatial light modulator [13] or a digital micromirror device [58], for the sequential projection of test patterns, ϕi. Figure 6 illustrates a typical TRAFIX configuration. Briefly, an ultrashort laser pulse illuminates a widefield area on the dynamic light shaper (DLS) using a beam expander (BE). The selection of the light source is discussed in Note 1. A sequence of patterns (ϕi) is displayed on the DLS and relayed by a relay lens (RL) onto the diffraction grating (DG). The DG sets up spatial dispersion at the entrance pupil (EP) of the objective (Obj) using the lens (L). The Obj temporally focuses the patterns through the sample (S). The total fluorescence signal is filtered by a dichroic mirror (DM) and collected by the single-pixel detector (SPD) into a measurement vector yi.

Fig. 6 TRAFIX setup, illustrating the projection of a sequence of patterns (ϕi) and single-pixel detection of a series of measurements (yi). BE: beam expander; DLS: dynamic light shaper; RL: relay lens; DG: diffraction grating; L: lens; EP: entrance pupil; Obj: objective; S: sample; DM: dichroic mirror; SPD: single-pixel detector. Numbers (1–3) correspond to the image plane on the DG, the common Fourier plane, and the sample image plane, respectively

A DLS is used to sequentially shape the light field into a series of test patterns. A beam expander is used to illuminate a widefield region of the DLS and to efficiently utilize the available pixels. The DLS can be embodied by a spatial light modulator (SLM) by encoding each pattern with a blazed grating, i.e., multiplying each pattern in the Cartesian space, ϕi→ϕxy, by a wrapped linear phase $\gamma = x d_x + y d_y \pmod{1}$, where dx and dy are spatial frequencies of the grating. Additionally, assuming ϕ ∈ [0, 1], an orthogonal (90°) dispersion of complementary patterns via $\phi_{xy}\gamma + (1 - \phi_{xy})\gamma'$, where $\gamma' = x d_x - y d_y \pmod{1}$, leads to a clearer separation of the light-field energy in diffracted orders. Following the SLM, the first diffracted order should be selected by spatial filtering implemented using a pinhole. Alternatively, a digital micromirror device (DMD) can be used to directly deflect a binary pattern. DMDs typically have faster projection rates, in the 10-kHz range, but are limited to binary (on–off) light shaping and may exhibit high loss. Nematic liquid crystal SLMs enable greater control over the light field, including grayscale patterns, however, at sub-kHz rates. Despite the high maximum speed of these devices, a practical limit exists in the speed with which the test patterns can be generated and sent to the hardware. This throughput limit is set by the efficiency of the software, which is unlikely to be met by scientific toolsets such as MATLAB or ImageJ.
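The blazed-grating encoding described above can be sketched as follows. This is an illustrative example under stated assumptions, not the software used in the original studies: the function name `encode_pattern`, the pattern size, and the grating frequencies are hypothetical, the phase is expressed as a fraction of one wave (so that a value of 1 corresponds to 2π), and dx and dy are given in cycles per SLM pixel.

```python
import numpy as np

def encode_pattern(phi, dx, dy):
    """Encode a pattern phi (values in [0, 1]) onto two wrapped linear phase ramps.

    Returns the phase mask in units of one wave, i.e., values in [0, 1).
    """
    rows, cols = phi.shape
    y, x = np.mgrid[0:rows, 0:cols]
    gamma = np.mod(x * dx + y * dy, 1.0)        # grating carrying the pattern
    gamma_prime = np.mod(x * dx - y * dy, 1.0)  # grating carrying the complement
    return phi * gamma + (1.0 - phi) * gamma_prime

# Hypothetical binary test pattern and grating frequencies (cycles per pixel)
rng = np.random.default_rng(3)
pattern = rng.integers(0, 2, size=(16, 16)).astype(float)
mask = encode_pattern(pattern, dx=0.125, dy=0.125)

# Scaling to an 8-bit image is one common way to address an SLM; this step is an assumption
mask_8bit = np.round(mask * 255).astype(np.uint8)
print(mask_8bit.shape, mask_8bit.dtype)
```

For a binary pattern, each pixel carries one of the two gratings, so light from "on" and "off" pixels is diffracted in different directions and the pattern can be isolated with the pinhole.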

The plane of the DLS forms the first image plane of the system. The image plane is relayed to a diffraction grating that enables TF. The DG is conjugate to the back aperture of the objective (common Fourier plane of the 4f system), such that spatial frequencies of each test pattern are linearly dispersed along one axis according to the finite bandwidth of the laser pulse. In the previous sections, we considered TF of a Gaussian beam (Eq. 1). However, widefield test patterns used for single-pixel imaging possess a breadth of spatial frequencies. Thus, a consideration has to be made on the finest spatial frequency in the patterns, relating to lateral resolution, and the extent of spatial dispersion for TF. Previous work utilized TF dispersion aspect ratios (β) of approximately 4–8 [13, 58].

We consider that a given test pattern comprises a superposition of high-spatial-frequency and low-spatial-frequency light fields. The high-frequency component occupies a large space in the Fourier plane and is spatially dispersed to an effectively low extent, i.e., a small relative β = αΩ/s, where s is large. The low-frequency component occupies a small region of the Fourier space and is dispersed to a large extent. At the sample plane, the high-frequency light field is axially sectioned due to the tight spatial focus, in a similar fashion to diffraction-limited point-scanning microscopy. Its robustness through scattering is established by the high NA. The low-frequency light field is sectioned via TF and propagates through scattering media due to the effective increase in the NA from the spatial dispersion. Thus, if the combination of the spatial-frequency bandwidth of the test patterns and the spatial dispersion from TF fills the entire back aperture of the objective, axial sectioning over a widefield and robust propagation in scattering media will be achieved.

The selection of relative dispersion, or β, is achieved by choosing an appropriate DG period and the objective and lens pair. For instance, a 1200 g/mm reflective blazed grating, a 400-mm focal length lens, and 20x 0.75NA and 100x 0.7NA objectives (Nikon, Japan) were used in the original studies of TRAFIX [13, 58]. Practically, a 4f relay system could be introduced between the DG and Obj to provide an additional degree of freedom in magnification. The RL system between the DLS and DG can be used to independently control the magnification of the test pattern without affecting the TF dispersion.

The detection system is equivalent to a conventional point-scanning system, where an appropriate wavelength filter directs the total fluorescence signal to a widefield single-pixel detector. Typically, a photomultiplier tube (PMT) may be used and then low-pass filtered to double the pattern projection rate [14, 58]. The initial demonstration [13] utilized an electron-multiplying CCD as an SPD by integrating the total signal, which provides the added capacity for conventional widefield imaging. For particularly sensitive applications, photon-counting devices may be implemented.

#### 2.3.2 Imaging

Initial work employed the Hadamard matrix as the set of test patterns (Φ) [13, 14, 75, 76]. The Hadamard matrix [65] (also termed the Walsh–Hadamard matrix) is formed recursively. It is symmetric, and its rows (and columns) are mutually orthogonal. It is routinely used for single-pixel imaging because it is easy to generate and, up to a normalization factor, it is its own inverse. The first-order Hadamard matrix is unity; the second order is given as

$$H_2 = \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix};\tag{5}$$

and the matrix of order 2n is formed from the matrix of order n as

$$H_{2n} = \begin{bmatrix} H_n & H_n \\ H_n & -H_n \end{bmatrix}. \tag{6}$$

Fig. 7 Images recovered using TRAFIX. (a) Reference image of a fluorescent test sample with no scattering, (b) imaged through a 400-μm brain slice using widefield detection, and (d) TRAFIX. (c) TRAFIX through a 200-μm brain slice. (e,g) Reference images of mouse-derived astrocytes compared with (f,h) TRAFIX. Scale bar is 20 μm. (a–d) Adapted from [13]

The sizes of Hadamard matrices are limited to powers of 2, which set the allowed image sampling sizes. Each row of the Hadamard matrix represents an individual test pattern.
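The recursive construction in Eqs. 5 and 6, and its use as a measurement matrix, can be sketched as follows. This is an idealized illustration under stated assumptions: the object is a random 8 × 8 array, the detector is noiseless, and the −1 entries are applied directly, whereas a physical projector can only display non-negative intensities.

```python
import numpy as np

def hadamard(order):
    """Build the Hadamard matrix of size `order` (a power of 2) using Eq. 6."""
    if order < 1 or order & (order - 1):
        raise ValueError("order must be a power of 2")
    h = np.array([[1]])
    while h.shape[0] < order:
        h = np.block([[h, h], [h, -h]])
    return h

# Toy single-pixel measurement of an 8 x 8 object (N = 64 pixels).
n_side = 8
N = n_side * n_side
H = hadamard(N)                    # each row is one test pattern
rng = np.random.default_rng(1)
x = rng.random(N)                  # hypothetical object, flattened to a vector
y = H @ x                          # one single-pixel detector reading per pattern
x_rec = (H.T @ y) / N              # H is its own inverse up to the factor 1/N
image = x_rec.reshape(n_side, n_side)
```

With all N patterns measured, the object is recovered exactly by a single matrix product, which is the property that makes the Hadamard basis convenient for single-pixel imaging.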

Figure 7 shows examples of images generated with TRAFIX using Hadamard test sequences [13]. Figure 7a shows a reference image of a test target (x) fabricated from a 200-nm spin-coated layer of super-yellow polymer. The target was imprinted by photobleaching a negative pattern. When the test target is obscured by a 400-μm section of rat brain tissue (mean-free path length, l<sub>s</sub> = 55 μm), conventional widefield detection using a camera is impossible due to scattering in the detection (Fig. 7b). Using TRAFIX, the image may be retrieved when obscured by 200-μm (Fig. 7c) and 400-μm (Fig. 7d) sections of rat brain tissue.

Figures 7e–h demonstrate the proof-of-principle of TRAFIX on primary mouse astrocytes. No scattering was introduced; however, the results demonstrate the capacity to reconstruct images from the faint signal in biological samples. Importantly, a long recording time per test pattern on the order of 0.1–0.5 s was required due to the low pulse energy density of the fast-repetition-rate (80 MHz) laser used. However, even with a non-ideal laser, three-photon signal recovery was demonstrated with TRAFIX [75]. Wadduwage et al. [15] built on this concept and demonstrated a stronger signal and faster widefield detection of mouse muscle at depth by using a high-pulse-energy laser (see Note 1) and a hybrid demixing method.

The capacity for CS gives TRAFIX an advantage in the ultimate imaging speed and in the reduction of photodamage. However, a special consideration has to be made to the implementation of CS in microscopy compared to conventional macroscopic compressive imaging (e.g., photography). Typically, applications of CS in imaging employ Hadamard or randomly generated matrices. The test patterns from those measurement matrices possess a broad spatial-frequency bandwidth that is difficult to propagate through high-NA imaging systems. The exceptionally high, step-wise variations in intensity that are characteristic of these test patterns, even at low pixel sampling N, lead to large diffraction effects and an overfilling of the back aperture of the objective. Figure 8 shows a simulation of the field intensity at the DG (1), the common Fourier plane (2), and the sample (3), corresponding to the locations marked in Fig. 6. The blue circle indicates the entrance pupil size. The Hadamard pattern (Fig. 8a) demonstrates a large structured pattern in the Fourier plane that exceeds the entrance pupil. The random pattern (Fig. 8b) shows a broad specular bandwidth. In both cases, the sampling pixel size of each pattern was set as the diffraction limit.

Fig. 8 Compressive sensing in microscopy. (a–c) The projected pattern (1), the spectrum in the Fourier plane (2) clipped by the objective pupil (blue circle), and the resulting pattern in the sample plane (3), for the (a) Hadamard, (b) random, and (c) Morlet patterns. (d) Performance of compression (M/N) using each pattern set, compared to the reference image (e). Scale bar is 10 μm. Adapted from [58]

The limit in the spatial-frequency bandwidth leads to two important issues in CS. First, the effective low-pass filtering means that the pattern projected onto the sample plane differs from the one assumed in the CS optimization algorithm. This is clear from comparing locations (1) and (3) in Fig. 8. In this scenario, the CS problem becomes more poorly conditioned; the noise in the measurement process may be attributed to larger non-existing frequencies. Second, diffraction causes a large portion of the pulse energy to be blocked by the entrance pupil. In the widefield regime, pulse energy is an important parameter to maximize, directly impacting the signal-to-noise ratio.

An alternative pattern set was proposed to control and selectively probe the spatial-frequency space, generated from Morlet wavelets mathematically convolved with randomly generated matrices [58, 77]. These Morlet patterns feature two important properties. First, the Morlet wavelet, based on real-valued Gabor filters, reaches the limit set by the uncertainty principle, i.e., an optimal trade-off between spatial and spatial-frequency localization. Second, the convolution with a randomly generated, Gaussian-distributed matrix satisfies the mutual incoherence property required by CS.
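The idea of band-limiting a random pattern by convolution with a Morlet wavelet can be sketched as below. This is a simplified illustration, not the pattern generator of [58, 77]: the kernel form (a Gaussian envelope multiplied by a cosine carrier), the parameters `sigma`, `f0`, and `theta`, the circular FFT-based convolution, and the pupil radius of 0.3 cycles per pixel are all assumptions made for the example.

```python
import numpy as np

def morlet_kernel(size, sigma, f0, theta=0.0):
    """Real-valued 2D Morlet (Gabor) kernel on a periodic grid.

    sigma: Gaussian envelope width in pixels; f0: carrier in cycles per pixel.
    """
    c = np.fft.fftfreq(size) * size              # signed pixel coordinates
    x, y = np.meshgrid(c, c)
    envelope = np.exp(-(x**2 + y**2) / (2 * sigma**2))
    carrier = np.cos(2 * np.pi * f0 * (x * np.cos(theta) + y * np.sin(theta)))
    return envelope * carrier

def morlet_pattern(size, sigma, f0, theta, rng):
    """Circularly convolve a Gaussian-distributed random matrix with the kernel."""
    noise = rng.standard_normal((size, size))
    kernel = morlet_kernel(size, sigma, f0, theta)
    return np.real(np.fft.ifft2(np.fft.fft2(noise) * np.fft.fft2(kernel)))

rng = np.random.default_rng(2)
pattern = morlet_pattern(64, sigma=4.0, f0=0.1, theta=0.0, rng=rng)

# Fraction of spectral energy inside a hypothetical pupil of radius 0.3 cycles/pixel.
f = np.fft.fftfreq(64)
fx, fy = np.meshgrid(f, f)
power = np.abs(np.fft.fft2(pattern)) ** 2
in_pupil = power[np.hypot(fx, fy) <= 0.3].sum() / power.sum()
```

Because the kernel spectrum is a pair of Gaussian lobes centred on ±`f0`, nearly all of the pattern energy stays inside the assumed pupil, while the random factor keeps different patterns dissimilar. The resulting pattern takes negative values, so a projected version would need an offset or a complementary-pattern scheme.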

Figure 8c shows the propagation of a Morlet pattern [58]. It is evident that the Morlet patterns can be designed to fit the entrance pupil, and the pattern at the sample plane resembles the pattern projected by the DLS. In this demonstration, the sampling pixel size N matches the theoretical diffraction limit. Interestingly, the CS sampling pixel size sets the image resolution, yet, due to the Nyquist sampling criterion, a bandwidth of twice the pattern frequency is required to propagate the pattern through a microscopy system. This corresponds to the observations in Fig. 8a–c (2) and suggests that an image resolution below twice the diffraction limit is not achievable with Hadamard or random measurement matrices. Optionally, this can be overcome with digital microscanning techniques [78], by taking multiple CS measurements with a series of patterns, spatially shifted by half the sampling size. Morlet patterns, due to the independent selection of bandwidth and sampling size, are able to reach the diffraction limit.

CS recovery from microscopy data requires additional consideration. The conventional method of CS recovery, ℓ<sub>1</sub>-norm minimization via the basis pursuit algorithm, is inefficient for imaging applications for several reasons. First, the CS problem is linearized into vector form (Eq. 4), which makes it broadly applicable to signal processing; however, the 2D/3D nature of imaging can be exploited to improve CS recovery, for example by assuming some extent of spatial smoothness in the image. Second, the computational cost of the basis pursuit method scales with the total number of sampling pixels N, making the recovery of large images (e.g., exceeding 64 × 64 pixels) computationally taxing. Finally, the ℓ<sub>1</sub>-norm minimization is constrained to provide a solution that strictly satisfies y = Φx. It is inevitable that some noise is introduced in imaging, especially when the multiphoton signal from the sample is faint; thus, concessions for noise should be made. This is particularly important in the stability of measurement bases, which we discuss in Note 3. Taking the above into account, first-order approximations to the CS problem may be used to achieve satisfactory, and in some instances improved, performance. In particular, we can utilize Nesterov's method [73] (NESTA<sup>2</sup>), which uses a smooth approximation to the ℓ<sub>1</sub>-norm, minimizing ‖s‖<sub>1</sub> subject to ‖y − Φx‖<sub>2</sub> ≤ ε, for some estimated measurement noise ε. NESTA was found to be substantially faster, less hardware demanding, and more resilient to measurement noise [58].
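To illustrate noise-tolerant first-order recovery, the sketch below uses the iterative soft-thresholding algorithm (ISTA) on the Lagrangian form min ½‖y − Φx‖<sub>2</sub><sup>2</sup> + λ‖x‖<sub>1</sub>. ISTA is swapped in here as a simpler stand-in and is not NESTA. The problem sizes, the regularization weight `lam`, the noise level, and the assumption that the object is sparse in the pixel basis are all choices made for the example.

```python
import numpy as np

def ista(Phi, y, lam=0.01, n_iter=3000):
    """Iterative soft-thresholding for min 0.5*||y - Phi x||_2^2 + lam*||x||_1."""
    step = 1.0 / np.linalg.norm(Phi, 2) ** 2       # 1 / Lipschitz constant of the gradient
    x = np.zeros(Phi.shape[1])
    for _ in range(n_iter):
        z = x - step * (Phi.T @ (Phi @ x - y))     # gradient step on the data term
        x = np.sign(z) * np.maximum(np.abs(z) - lam * step, 0.0)  # soft threshold
    return x

rng = np.random.default_rng(3)
N, M, K = 128, 64, 5                               # pixels, measurements, non-zero pixels
x_true = np.zeros(N)
support = rng.choice(N, K, replace=False)
x_true[support] = rng.choice([-1.0, 1.0], K) * (1.0 + rng.random(K))
Phi = rng.standard_normal((M, N)) / np.sqrt(M)     # random measurement matrix
y = Phi @ x_true + 0.01 * rng.standard_normal(M)   # noisy compressed measurements
x_hat = ista(Phi, y)
rel_err = np.linalg.norm(x_hat - x_true) / np.linalg.norm(x_true)
```

Unlike strict basis pursuit, the data term here is only penalized rather than enforced exactly, so measurement noise is not forced into the reconstruction.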

Figure 8d demonstrates the CS performance of the measurement matrices in TRAFIX [58]. The sample comprises 4.8-μm green fluorescent polystyrene beads (G0500, Thermo Scientific) embedded on a glass microscope slide. A reference image is provided in the top right of Fig. 8d. Various compression ratios (M/N) are evaluated. The Morlet pattern set exhibits a superior performance at high compression. The pulse energy density is kept constant between measurements; thus, the higher intensity in the Morlet-generated results suggests a higher signal-to-noise ratio. In fact, equivalent image contrast is achieved at 25%, 67%, and 82% compression, for the Hadamard, random, and Morlet patterns, respectively [58]. Additionally, the Morlet pattern set features a stronger performance through scattering media [58].

#### 2.3.3 Hybrid Demixing

An interesting alternative approach, combining patterned illumination and widefield camera detection, toward the demixing of scattering was presented by Wadduwage et al. [15]. This technique was termed de-scattering with excitation patterning (DEEP). The principle is conceptually similar to another technique presented by Parot et al. [79], termed compressed Hadamard imaging (CHI). We recognize that camera detection of TF signal at depth is obscured by the cross-talk of scattering between adjacent pixels. However, the contribution due to scattering of the signal from x<sub>i</sub> to an imaging pixel y<sub>j</sub>, where the coordinate j is substantially far away from i, is minimal. Therefore, if cross-talk can be removed between adjacent pixels, rather than completely from the entire imaging process, depth imaging can still be achieved.

<sup>2</sup> https://statweb.stanford.edu/~candes/software/nesta/.

DEEP or CHI operates by projecting a sequence of small Hadamard codes (e.g., 16 × 16 pixels), repeated and tessellated over a large-pixel-count image. The mapping of the patterns onto the camera pixels is performed by imaging a calibration phantom (a thin fluorescent layer). The calibration set is then used to demodulate patterned recording sequences. In effect, this method realizes a parallel version of single-pixel imaging and is equivalently amenable to CS [15, 79]. Additionally, this can be performed in a line-scanning configuration [57]. A trade-off exists in choosing the size of the Hadamard code. The length of the code establishes the distance in the camera over which the scattering cross-talk can be eliminated; however, a larger length requires more total measurements.
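The role of the code length can be illustrated with a heavily idealized toy model. The sketch below is not the DEEP or CHI reconstruction: it assumes ±1 codes, perfect registration between projector and camera pixels (standing in for the calibration step), periodic image edges, and a hypothetical cross-talk model in which each pixel leaks only into its eight nearest neighbours.

```python
import numpy as np

def hadamard(order):
    """Sylvester construction; `order` must be a power of 2."""
    h = np.array([[1]])
    while h.shape[0] < order:
        h = np.block([[h, h], [h, -h]])
    return h

def scatter(img, centre=0.6, leak=0.05):
    """Toy cross-talk: each pixel leaks into its 8 nearest neighbours (periodic edges)."""
    out = centre * img
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if dr or dc:
                out = out + leak * np.roll(img, (dr, dc), axis=(0, 1))
    return out

tile, size = 4, 32                         # 4 x 4 code tessellated over a 32 x 32 image
H = hadamard(tile * tile)                  # 16 codes, each of length 16
patterns = np.stack([np.tile(H[k].reshape(tile, tile), (size // tile,) * 2)
                     for k in range(tile * tile)])

rng = np.random.default_rng(5)
obj = rng.random((size, size))             # hypothetical fluorescent object
frames = np.stack([scatter(p * obj) for p in patterns])   # one camera frame per code

widefield = scatter(obj)                   # uniform illumination keeps the cross-talk
demixed = (patterns * frames).mean(axis=0) # per-pixel demodulation with the known codes
```

In this model, neighbouring pixels within a tile carry orthogonal codes, so their leaked signal averages to zero and `demixed` equals the attenuated object `0.6 * obj`. Cross-talk reaching a full tile period or further would survive, which is why a longer code removes cross-talk over a larger distance at the cost of more frames.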

#### 3 Future Prospects

TF offers a new paradigm of widefield, axially confined multiphoton excitation; yet, as we demonstrate in Subheading 2.1, it shares a common mathematical foundation with point-scanning methods. Point-scanning methods sweep the field of view with a series of laser pulses using mechanical mirrors. TF, however, does so elegantly by controlling the spatiotemporal evolution of the phase front. In theory, equivalent field of view, axial confinement, and depth penetration can be obtained using both methods. In practice, however, TF can do so with a single pulse, including the 3D excitation via holographic means [64]. As such, TF has become advantageous for rapid surface imaging and precision optogenetics at depth [56].

The capacity for widefield imaging at depth has been introduced with multiplexed detection schemes. Due to its relatively young stage of technological maturity, TF imaging at depth, particularly in two-photon modes, will struggle to achieve the same speed as point-scanning methods within the next several years (see Note 4). The maturation of high-speed light shaping technology and massively parallel detection schemes can improve the prospects of TF, however. For instance, parallel detection by using each pixel of an EMCCD as a single-pixel detector combined with local, repeating pattern projection, could elevate the speed to beyond 30 fps with short measurement matrices.

Recently, three-photon microscopy has re-emerged due to the availability of near-infrared, high-pulse-energy lasers, three-photon fluorescent markers, and a desire to perform deep, volumetric imaging in vivo and through scattering media, such as a mouse skull [7, 80, 81]. There, due to the low repetition rate of the lasers, the speed of point scanning is on par with multiplexed detection methods (see Note 4). Toward this end, the promise of three-photon microscopy has already been shown with TRAFIX [75]. In vivo, through-skull imaging is an important goal for three-photon technology [3]. The lowered goal posts in speed and the requirement for minimal photodamage present a likely niche for TF imaging. Beyond this, there are several key capabilities that can give TF not only a competitive edge, but also deliver unprecedented performance in select areas, including in neuroscience applications.

A likely key to the success of this technology is the combination of sparse detection and precision excitation. The concept of sparsity allows for a fundamentally lower number of measurements to be made to recover the same information. A clear advantage here comes in the form of reduced photobleaching and photodamage [13] compared to point-scanning methods. TF may find immediate utility in applications that emphasize long-term, non-invasive imaging over rapid detection, for instance in developmental studies.

Controlled excitation of arbitrary fields in 3D is possible with TF holographic methods [24, 61–63, 82]. The combination of controlled excitation and multiplexed detection lends itself to adaptive sampling schemes. For instance, a low-resolution volume may be formed rapidly; then, based on the features of the low-resolution volume, sparse volumes of interest may be dynamically probed at higher resolutions on-the-fly. This would be of great advantage to sparsely populated samples, for instance, for the tracking of individual cellular bodies in 3D biomaterials. In neuroscience, in particular, the combination of holography and multiplexing may enable the volumetric tracking of individual sparse neurons and their connectivity at high spatiotemporal resolutions with no a priori information about their locations. In fact, a similar compressive approach was demonstrated using light-field microscopy [83].

Optogenetic excitation with TF has already shown great promise [56]. Naturally, the addition of multiplexed detection may enable all-optical functional imaging. For instance, optogenetic excitation with TF can be combined with TRAFIX recordings of a reporter (e.g., a voltage sensor), given that the absorption and emission spectra are well-separated. The detected temporal response could be demodulated based on the illumination pattern sequence using the CS method, however, with several caveats.

The key consideration in this endeavor is the repeatability of the neuronal response. One approach to performing CS in this scenario would be to project a pattern and record the multiplexed response over a desired duration. This is then repeated for all other patterns. The caveat is that the response from all neurons should be repeatable each time they are excited; otherwise, CS will not be able to accurately demodulate the signal. It is a big ask for biological samples to act in such a repeatable manner. The other approach would be to project all the patterns within the desired temporal resolution timeframe and repeat this for all time points of the experiment. The added temporal domain of video imaging is amenable to further improvements to the compression ratio, even down to a few percent [84]. This approach would remove the need for repeatability; however, fast pattern projection and fluorescent reporter switching rates are needed. This is readily achievable for calcium imaging, where the desired temporal resolution is on the sub-second to second scale [5]. For voltage sensing on the millisecond scale [85–88], pattern projection and reporter switching rates should be on the microsecond scale. This may be presently achievable for sparse simultaneous measurements on the order of tens of neurons; however, widefield imaging is currently intractable. An ambitious vision could be to map the volumetric structure of a neuronal sample with an adaptable resolution, identify and optically stimulate each neuron, and map the sparse connectivity to its neighbors with functional imaging. In fact, the idea of sparsity and adaptive resolution is well-suited to neuroscience methods, where neuronal connectivity is naturally sparse over a widefield and in vivo behavioral stimuli might only affect a subset of the interrogated volume.

These considerations form a grand challenge for TF imaging in neuroscience. Despite the appreciable challenges in the road ahead, the holy grail of this undertaking is the unprecedented capacity for low-photodamage, highly parallel, adaptive and rapid interrogation of neuronal signals, and their connectivity in turbid 3D structures, and potentially in vivo.

#### 4 Summary

Compressive sensing and temporal focusing have seen remarkable progress in the past decade, in their own right. The combination of these two techniques in widefield imaging at depth provides a broad canvas for powerful future advances, both in the novel physics of the spatiotemporal manipulation of light and in the new computationally driven paradigm of imaging. Undoubtedly, from the rapid uptake and interest of this relatively young field, many developments are still in store over the coming years. However, a grand challenge remains to fully exploit the multiplexing capacities in the illumination and the detection methods, to provide capacity not available to serial point-scanning approaches, and to demonstrate it in biological samples.

#### 5 Notes

#### 1. Ultrashort Pulsed Laser Selection

An important practical consideration is the choice of the laser source. For comparison, a laser source for two-photon point-scanning microscopy typically features a fast repetition rate (approximately 80 MHz) with pulse energy densities at the sample in the range of 0.1–0.5 nJ/μm<sup>2</sup> [89]. Incorporating the same laser in a widefield configuration leads to a large loss in energy per unit area. For example, at the maximum power setting of a point-scanning laser (Chameleon Ultra II, Coherent Inc., USA), the pulse energy at the sample is 6 nJ, and the pulse energy density is 0.6 pJ/μm<sup>2</sup> [13] for a modest 100-μm field of view. This is more than two orders of magnitude lower than the point-scanning values above, which makes imaging biological samples with weak multiphoton signals a significant challenge. At the other end of the scale, low-repetition high-power lasers can be employed. In Wadduwage et al. [15], a 10-kHz repetition rate regenerative amplifier laser provided a remarkable 1.5 μJ pulse energy with 0.06 nJ/μm<sup>2</sup> over their 160-μm field of view; however, the laser in that instance was not tunable in wavelength. Point-scanning methods require a high repetition rate to ensure a fast raster scanning speed across the sample. For widefield imaging, this is not required; thus, low-repetition high-pulse-energy lasers are preferred for a good signal-to-noise ratio.
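The quoted energy densities can be checked with a short calculation. The sketch assumes the illuminated area is a square whose side equals the stated field of view; that geometry is an assumption made here, although it does reproduce both quoted values.

```python
def energy_density(pulse_energy_j, fov_um):
    """Pulse energy per unit area (J/um^2), assuming a square field of view."""
    return pulse_energy_j / fov_um**2

oscillator = energy_density(6e-9, 100)     # 6 nJ over a 100-um field of view
amplifier = energy_density(1.5e-6, 160)    # 1.5 uJ over a 160-um field of view

print(f"{oscillator * 1e12:.2f} pJ/um^2")  # prints 0.60 pJ/um^2
print(f"{amplifier * 1e9:.3f} nJ/um^2")    # prints 0.059 nJ/um^2
```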

#### 2. Chirp Evolution in Temporal Focusing

Following Durfee et al. [59], we examine TF from the perspective of position-dependent spectral chirp. TF is considered as the superposition of paraxially propagating Gaussian "beamlets" modified by ray optics to incorporate the tilt of the spectral components. Consider a Gaussian beamlet that undergoes a tilt from spectral dispersion, transforming the lateral position x → x − z sin θ<sub>x</sub>, where θ<sub>x</sub> = αΔω/f. The amplitude is given as [59]

$$A(x, z, \omega) = E_0(\omega) \frac{s_2}{s_2(z)} e^{-\frac{(x - z \sin \theta_x)^2}{s_2^2(z)}},\tag{7}$$

and the phase as

$$\phi(x, z, \omega) = k_0 x \sin \theta_x + k_0 z \left( 1 - \frac{1}{2} \sin^2 \theta_x \right) - \eta(z) + k_0 \frac{\left( x - z \sin \theta_x \right)^2}{2R(z)}, \tag{8}$$

where s<sub>2</sub>(z) is the axially dependent beam radius at the focus, R is the radius of curvature, and η is the Gouy phase, given with respect to the Rayleigh range, z<sub>R</sub> = k<sub>0</sub>s<sub>2</sub><sup>2</sup>/2, as

$$s_2(z) = s_2 \sqrt{1 + \frac{z^2}{z_R^2}}, \quad R(z) = z \left(1 + \frac{z_R^2}{z^2}\right), \quad \text{and} \quad \eta(z) = \arctan\left(\frac{z}{z_R}\right).$$

The position-dependent chirp can be evaluated from the phase in Eq. 8 by taking the derivative with respect to ω = ω<sub>0</sub> + Δω, evaluated around the central frequency ω<sub>0</sub>. The first-order chirp is [59]

$$\phi_1(x, z) = \frac{z}{c} + \frac{x}{s_2} \beta \tau_0 \left(\frac{1}{1 + z^2/z_R^2}\right) + \frac{x^2}{2cR(z)}, \tag{9}$$

where β = αΩ/s represents the aspect ratio of the spatial dispersion rate with respect to the beam size (s) at the common Fourier plane and τ<sub>0</sub> is the transform-limited pulse width. Notably, the first term is the arrival time of the pulse, and the last term is symmetric with x and represents the curvature of the beam. The middle term is linear with x and represents a characteristic trait of TF: a pulse front tilt (PFT) [90]. Note that a linear phase shift in the spectral domain relates to a temporal shift in the time domain via the Fourier shifting property. As such, PFT describes the property of TF whereby the pulse rapidly sweeps across the lateral dimension of the focal plane.

Similarly, the second-order chirp is given as [59]

$$\phi_2(x, z) = \left(\frac{x}{s_2}\frac{\tau_0 \beta}{\omega_0} - \frac{z}{z_R}\frac{\tau_0^2 \beta^2}{4}\right)\left(\frac{1}{1 + z^2/z_R^2}\right). \tag{10}$$

Notably, ϕ<sub>2</sub> is dominated by the quadratic term, which is negligible at the focus and increases with z, with a linear relationship when z is well within the Rayleigh range z<sub>R</sub> = k<sub>0</sub>s<sub>2</sub><sup>2</sup>/2. This second-order chirp leads to pulse broadening from temporal dispersion away from the focus (illustrated in Fig. 2).
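The axial behaviour described here can be explored numerically. The sketch below evaluates the Gaussian-beam quantities defined above together with the z-dependence of the β<sup>2</sup> term of the second-order chirp, read here as (z/z<sub>R</sub>)/(1 + z<sup>2</sup>/z<sub>R</sub><sup>2</sup>). The wavelength, the focal beam radius, and the function names are hypothetical choices for illustration.

```python
import numpy as np

def rayleigh_range(wavelength, s2):
    """z_R = k0 * s2**2 / 2 with k0 = 2*pi / wavelength."""
    return np.pi * s2**2 / wavelength

def beam_radius(z, s2, z_r):
    """Axially dependent beam radius s2(z)."""
    return s2 * np.sqrt(1 + (z / z_r) ** 2)

def gouy_phase(z, z_r):
    """Gouy phase eta(z)."""
    return np.arctan(z / z_r)

def chirp_axial_factor(z, z_r):
    """z-dependence of the quadratic (beta^2) term of the second-order chirp."""
    u = z / z_r
    return u / (1 + u**2)

wavelength, s2 = 0.8, 1.0                  # micrometres; hypothetical values
z_r = rayleigh_range(wavelength, s2)
z = np.linspace(-5 * z_r, 5 * z_r, 1001)
factor = chirp_axial_factor(z, z_r)
```

The factor vanishes at the focus, grows linearly for |z| much smaller than z<sub>R</sub>, and reaches its largest magnitude of 0.5 at |z| = z<sub>R</sub>, consistent with pulse broadening that sets in as the beam leaves the focal plane.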

#### 3. Stability of Compressive Sensing

Like many other inversion methods, the efficacy of CS relies on several closely linked parameters: the mutual incoherence and the condition number of the measurement matrix, and the noise in the measurement. The interplay between these parameters dictates the qualities seen in the recovered images. For instance, a compressed Hadamard matrix is not mutually incoherent (in fact, it is perfectly coherent, i.e., there will be at least two columns that are identical); however, it is well-conditioned. As a result, CS with a Hadamard matrix leads to superior performance when the signal-to-noise ratio is low; however, the image may exhibit features that repeat spatially and, overall, will demonstrate poor performance with high compression. A random matrix is mutually incoherent and has a satisfactory condition number with compression. Thus, it will perform well with compression and will fail only when the signal-to-noise ratio is exceptionally low. As a result, the random matrix is often the basis of choice for many CS applications. A Morlet matrix is mutually incoherent; however, it has a poor condition number when the spatial bandwidth is highly constrained. While it is superior in microscopy applications due to the limited spatial bandwidth, it requires a strong signal-to-noise ratio, i.e., a strong fluorescent signal from the sample.
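Both quantities are easy to compute for small matrices. The sketch below compares a compressed Hadamard matrix with a Gaussian random matrix at 50% compression; the sizes, the choice of keeping the first M Hadamard rows, and the helper names are assumptions made for the example. A Morlet matrix is not included.

```python
import numpy as np

def hadamard(order):
    """Sylvester construction; `order` must be a power of 2."""
    h = np.array([[1]])
    while h.shape[0] < order:
        h = np.block([[h, h], [h, -h]])
    return h

def mutual_coherence(A):
    """Largest absolute normalized inner product between distinct columns."""
    cols = A / np.linalg.norm(A, axis=0)
    gram = np.abs(cols.T @ cols)
    np.fill_diagonal(gram, 0.0)
    return gram.max()

N, M = 64, 32                                # pixels and measurements (M/N = 50%)
rng = np.random.default_rng(4)
had = hadamard(N)[:M].astype(float)          # first M Hadamard test patterns
rnd = rng.standard_normal((M, N))            # Gaussian random measurement matrix

for name, A in (("Hadamard", had), ("random", rnd)):
    print(name, mutual_coherence(A), np.linalg.cond(A))
```

With this row selection, columns j and j + 32 of the compressed Hadamard matrix are identical, so its coherence is 1, while its rows remain orthogonal and the condition number is 1. The random matrix shows the opposite trade-off: lower coherence but a condition number above 1.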

#### 4. Imaging Speed

Multiplexed widefield detection and CS promise a reduction in the total measurements and a capacity to capture rapid dynamic processes; however, the proof-of-principle demonstrations [13–15] have not yet shown a faster imaging speed compared to state-of-the-art point-scanning methods. This is largely due to the discrepancy in the maturity of the hardware and control software that drive these methods. Here, we explore the theoretical speeds that can be achieved by both methods.

In point-scanning two-photon microscopy, the imaging speed is limited by the repetition rate of the laser, which is 80 MHz for typical embodiments. Another limit is set by the scanning speed. High-speed resonant scanning systems can achieve a line scan rate of 12 kHz. Let us define the maximum imaging speed based on a 1 megapixel (MP) image; thus, 12 frames per second (fps) can be achieved practically. The speed of multiplexed imaging is limited by the speed of the light shaper. For an SLM, let us consider a conservative high-speed refresh rate of 300 Hz. For a 1MP image and 10% compression (M/N), 0.003 fps can be achieved. A DMD can reach speeds beyond 15 kHz. Similarly, for a 1MP image and 10% compression, 0.15 fps can be achieved. This is substantially lower than point scanning. Hybrid methods [15, 79] using electron-multiplying CCDs (EMCCD) with 25 fps (1MP image) could reach >1 fps with short Hadamard codes.
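The frame rates quoted above follow from simple ratios, sketched below. The 1 MP image is assumed to be 1000 × 1000 pixels, and the function names are illustrative.

```python
def point_scan_fps(line_rate_hz, lines):
    """Frame rate of a raster scan limited by the line rate."""
    return line_rate_hz / lines

def multiplexed_fps(pattern_rate_hz, pixels, compression):
    """Frame rate when M = compression * pixels patterns are projected per frame."""
    return pattern_rate_hz / (compression * pixels)

pixels = 1_000_000                                   # 1 MP image, assumed 1000 x 1000
resonant = point_scan_fps(12_000, 1000)              # 12-kHz resonant scanner
slm = multiplexed_fps(300, pixels, 0.10)             # 300-Hz SLM, 10% compression
dmd = multiplexed_fps(15_000, pixels, 0.10)          # 15-kHz DMD, 10% compression
print(resonant, slm, dmd)
```

The calculation makes the bottleneck explicit: at 10% compression a 1 MP frame still needs 100,000 patterns, so the pattern rate of the light shaper, rather than the laser or detector, sets the frame rate.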

In three-photon microscopy, the requirement for high pulse energy restricts the laser repetition rate to typically sub-MHz regimes. For point scanning, this limits imaging speed to below 0.1 fps (in our 1MP comparison). The speed of multiplexed imaging, however, is still limited by the light shaper. As such, multiplexed imaging may have an advantage in speed in widefield three-photon imaging.

#### References


microscanning. Optics Express 24:10476– 10485


microscopy. Opt Commun Opt Life Sci 281: 1796–1805



## High-Speed Neural Imaging with Synaptic Resolution: Bessel Focus Scanning Two-Photon Microscopy and Optical-Sectioning Widefield Microscopy

Guanghan Meng, Qinrong Zhang, and Na Ji

#### Abstract

The brain is composed of complex networks of neurons that work in concert to underlie the animal's cognition and behavior. Neurons communicate via structures called synapses, which typically require submicron spatial resolution to visualize. To understand the computation of individual neurons as well as neural networks, methods that can monitor neuronal morphology and function in vivo at synaptic spatial resolution and sub-second temporal resolution are required. In this chapter, we discuss the principles and applications of two enabling optical microscopy methods: two-photon fluorescence microscopy equipped with Bessel focus scanning technology and widefield fluorescence microscopy with optical sectioning ability, both of which could be combined with optogenetic stimulation for all-optical interrogation of neural circuits. Details on their design and implementation, as well as example applications, are presented.

Key words Optical microscopy, Neural imaging, Synapse, Bessel beam, Two-photon fluorescence microscopy, Widefield microscopy, Structured illumination microscopy

#### 1 Introduction

With the development of genetically encoded fluorescence reporters that convert neural activities into fluorescence fluctuations [1–7], optical microscopy has been widely applied to in vivo studies of brain function due to its non-invasiveness and subcellular spatial resolution. Over the past decades, major efforts have been made to improve the performance of optical microscopes in terms of their resolution, speed, and imaging depth [8–13]. Depending on the specific application, a distinct set of criteria should be applied in order to select the most appropriate imaging modality. In the case where synaptic responses need to be characterized (e.g., mapping all the synaptic inputs of a neuron), the ideal method should have high spatial resolution to resolve individual synapses and sufficient temporal resolution to capture all the ongoing activities in the observation area/volume. For studies in opaque samples (e.g., adult mammalian brains), multiphoton fluorescence microscopy is often the method of choice, because its nonlinear excitation leads to optical sectioning and its near-infrared illumination light provides optical access to structures at depth in highly scattering tissue [10–12]. Despite its ability to image scattering tissues, the main limitation of multiphoton microscopy is its speed, which is limited by the point-scanning scheme of its standard implementation [12], especially when applied to three-dimensional (3D) volumetric imaging. Bessel focus scanning technology, where a Bessel-like beam is utilized for fluorescence excitation [14–17], has been used to improve the volumetric imaging speed of multiphoton microscopy via an extended depth of field (DOF), and is discussed in Subheading 2 of this chapter. In contrast, for studies in transparent samples with minimal tissue scattering, widefield fluorescence microscopy can be applied, with the advantages of high imaging speed and simple hardware implementation (see also Chap. 2). However, widefield microscopy does not have optical sectioning ability [18–21]. As a result, significant out-of-focus signal contributes to a blurry and sometimes overwhelming background, which obscures the in-focus objects, especially small structures like synapses. One approach to impart optical sectioning ability to widefield fluorescence microscopy is via structured illumination (SI) [9, 22–25], where the sample is illuminated with a structured rather than uniform pattern of light to distinguish the in-focus information from the background signal. The details of this technology are discussed in Subheading 3 of this chapter [9].

Authors Guanghan Meng and Qinrong Zhang have equally contributed to this chapter.

Eirini Papagiakoumou (ed.), All-Optical Methods to Study Neuronal Function, Neuromethods, vol. 191, https://doi.org/10.1007/978-1-0716-2764-8_10, © The Author(s) 2023

#### 2 Bessel Focus Scanning Two-Photon Fluorescence Microscopy

#### 2.1 Background

#### 2.1.1 Multiphoton Excitation

Widefield fluorescence microscopes with optical sectioning ability enable high spatiotemporal resolution imaging in neurobiological model systems that are transparent (e.g., zebrafish larvae and Drosophila larvae), but cannot visualize deep structures in opaque samples (e.g., adult mammalian brains) due to tissue scattering. In multiphoton excitation fluorescence microscopy, the longer wavelength of excitation light renders the system more resistant to scattering and grants optical access to structures at depths up to 1–2 mm in the mouse brain [10–12]. Furthermore, the nonlinearity of multiphoton excitation can significantly suppress out-of-focus fluorescence generation, thus requiring no additional device or computation to realize optical sectioning.

2.1.2 Challenges of Volumetric Imaging with Multiphoton Microscopy

In a conventional multiphoton fluorescence microscope, fluorescence excitation is three-dimensionally confined. Thus 2D or 3D scanning of the focus is required to obtain a planar image or a volume stack, respectively. Such a point-scanning scheme ensures optical sectioning ability and high spatial resolution even in opaque tissue, but limits imaging speed. Therefore, to record neural dynamics with sufficient frame rates, scanners that can move the excitation focus at high speed are desired. Lateral beam steering (translation of the focus in a plane perpendicular to the optical axis) in multiphoton microscopy is typically achieved by a pair of galvanometer-based optical scanners (or "galvos"). Conventional raster-scanning galvos can achieve line scan rates of a few kHz [12]. By replacing one of the raster-scanning galvos with a faster scanner (e.g., resonant galvos [26]), line scan rates of tens of kHz and 2D frame rates of tens of Hz can be realized [12]. However, information observed from a single optical section can be incomplete. Since both neural circuits and individual neurons therein are 3D structures, volumetric imaging with subsecond temporal resolution is required to capture their dynamics completely. To perform volumetric imaging in multiphoton microscopy, in addition to scanning the focus in the lateral xy plane, axial movement of the excitation focus along the z direction is required. Axial scanning of the focus is more challenging than lateral scanning. A common method to scan the excitation focus axially is to translate the microscope objective along the z direction, the speed of which, however, is limited by the mechanical inertia of the objective. Wavefront shaping devices such as electrically tunable lenses (ETLs) [27], spatial light modulators (SLMs) [28], or acousto-optic deflectors (AODs) [29–31] allow the control of wavefront divergence at the objective back focal plane, which translates the focus axially.
These methods avoid the problems posed by objective inertia, but introduce additional aberrations to the system and degrade image quality at large focal shifts, since most microscope objectives are designed to produce optimal performance for light of a specific divergence (e.g., multiphoton excitation objectives are designed for collimated beams) [32]. A dual-objective remote focusing method, where the excitation light travels through two microscope objectives with conjugated pupil planes [33–35], has aberrations caused by the two objectives cancelled out and can achieve high-speed volumetric imaging at diffraction-limited resolution over a large axial range. The requirement of two objectives nevertheless increases the system cost and complexity, requiring custom optical design [36]. Moreover, all the above volumetric imaging methods are sensitive to axial motions of the sample, which cause the objects of interest to move out of focus and the loss of dynamic information. Bessel focus scanning multiphoton microscopy provides an alternative to the volumetric imaging methods described above [15, 16, 37, 38] and will be discussed in detail below.
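The relation between line rate, frame rate, and volume rate in a point-scanning system can be illustrated with a minimal sketch. The function names and the numerical settings below (16 kHz line rate, 512 lines per frame, 30 planes per volume) are hypothetical values chosen only for illustration; turnaround and axial flyback time are ignored.

```python
def frame_rate_hz(line_rate_hz: float, lines_per_frame: int) -> float:
    """2D frame rate of a point-scanning microscope.

    One frame requires `lines_per_frame` line scans, so the frame rate is
    the line rate divided by the number of lines (turnaround time ignored).
    """
    return line_rate_hz / lines_per_frame


def volume_rate_hz(line_rate_hz: float, lines_per_frame: int,
                   planes_per_volume: int) -> float:
    """Volume rate when a stack of planes is acquired one frame at a time."""
    return frame_rate_hz(line_rate_hz, lines_per_frame) / planes_per_volume


# Hypothetical settings, not taken from the chapter.
rate_2d = frame_rate_hz(16_000, 512)
rate_3d = volume_rate_hz(16_000, 512, 30)
print(f"2D frame rate: {rate_2d:.2f} Hz; 30-plane volume rate: {rate_3d:.2f} Hz")
```

Under these assumed settings, a frame rate of tens of Hz drops to roughly one volume per second once a stack of planes has to be acquired sequentially, which is the bottleneck that the axial scanning methods above try to relieve.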

Fig. 1 The concept of Bessel focus scanning multiphoton microscopy. (a) 2D scanning of a conventional Gaussian focus obtains information from a single plane (the plane in yellow). (b) 2D scanning of a Bessel focus covers a volume (the volume in orange)

2.1.3 Bessel Focus Scanning Technology During most brain imaging experiments in vivo, neurons exhibit variations in their fluorescence signal associated with their functional activity, but their positions remain unchanged. Therefore, when the temporal dynamics of neurons is the subject of interest, one does not need to constantly track their 3D locations, but can instead obtain axially projected views of 3D volumes via extended depth-of-field imaging by scanning an axially elongated focus, e.g., a Bessel-like beam [14–16, 39–41] (Fig. 1). Since all structures along the elongated focus are probed simultaneously within a single 2D scan, the 2D frame rate becomes a projected 3D volume rate. Although axial resolution is reduced substantially during Bessel focus scanning, the 3D positions of neurons and neuronal structures can be obtained from a conventional 3D stack by scanning a Gaussian focus. Because at the same numerical aperture (NA), a Bessel focus has a higher lateral resolution than a Gaussian focus [41], synapse-resolving lateral resolution is maintained even for a 0.3-NA Bessel focus [42]. With Bessel focus scanning, imaging throughput can improve by tens to a hundred times, with the image data size reduced by the same amount [15]. Bessel focus scanning technology is compatible with other fast scanning methods mentioned in Subheading 2.1.2 and, when combined together, can further boost the volumetric imaging speed of a multiphoton microscope [15, 38]. The extended depth of field of a Bessel focus also makes the imaging process insensitive to axial motion, which eliminates axial motion artifacts. Together with the much reduced data size, it substantially simplifies image processing [15, 42]. Although Bessel focus scanning technology can be incorporated into both two-photon and three-photon fluorescence

Fig. 2 Diagram of two Bessel focus modules. (a) A two-photon microscope with an SLM-based Bessel module (gray box). (b) An axicon-based Bessel module. L2 can be translated along the optical axis to change the axial length and numerical aperture of the Bessel beam. D is defined as 0 mm when the mask is at the front focal plane of L2 and positive when L2 moves away from the mask. Ti:Sa Ti:Sapphire laser, EOM electro-optical modulator, BE beam expander, M mirror, L lens, Obj objective

microscope systems [37], the sections below focus on Bessel focus scanning two-photon fluorescence microscopy for volumetric imaging.

2.2 Materials and Equipment

A typical two-photon fluorescence microscope includes the following optical components (Fig. 2): a femtosecond laser, an excitation power control unit such as an electro-optical modulator (EOM) (i.e., a Pockels cell), a pair of scanners/galvos, a scan lens and a tube lens, a dichroic mirror, an objective (often mounted on a piezoelectric stage to perform axial scanning), detection filters, and one or two photomultiplier tubes (PMTs).

A Bessel focus module includes the following optical components (Fig. 2): two mirrors to switch the light path between Gaussian and Bessel modes, an SLM or an axicon, a lens, an annular mask, and a pair of conjugation lenses.

2.3 Methods

2.3.1 Two-Photon Fluorescence Microscope

The essential components in the excitation light path of a two-photon fluorescence microscope are shown in Fig. 2a. The output of a Ti:Sapphire laser with a Gaussian intensity profile first passes through a Pockels cell, and then a beam expander (see note 1 in Subheading 2.4). The Gaussian beam is then directed either straight to the galvos or into a Bessel module, by moving two mirrors M1 and M2 out of and into the light path, respectively. The galvo pair can be either placed close together or conjugated with a pair of scan lenses. The latter eliminates beam wandering on the second galvo as well as at the objective back focal plane during scanning, but leads to more power loss because of the additional optical elements. In some systems, a resonant scanner is incorporated in addition to the two raster-scanning galvos, with the three scanners together enabling high-speed imaging in a small subfield positioned anywhere inside a large field of view [36, 38]. After the galvos, a scan lens (L4) and a tube lens (L5) conjugate the scanners to the back focal plane of the objective. The emission light path (e.g., the dichroic mirror and the detectors) is not shown in the diagram.

2.3.2 Design and Setup of a Bessel Focus Module An annular illumination pattern at the objective back focal plane generates a Bessel-like focus. This annular illumination can be created with a phase mask [14], an SLM [15, 42], or an axicon [16, 39, 40].

In an SLM-based Bessel focus module (Fig. 2a, rectangle box), a reflective phase-only SLM is placed at the front focal plane of a lens (L1). A circular binary phase pattern (alternating 0 and π) on the SLM diffracts the incident Gaussian beam preferentially into the ±1 diffraction orders, which form an annular ring at the back focal plane of L1. An annular aperture mask is placed at the back focal plane of L1 to selectively transmit the desired annular electric field, which is conjugated to the galvos by a pair of lenses L2 and L3, and then to the objective back focal plane.

An axicon-based Bessel focus module (Fig. 2b) has a similar configuration to an SLM-based module, except that an axicon is placed at the front focal plane of L1. The conical surface of the axicon refracts the light according to Snell's law, which forms a ring at the back focal plane of L1. The annular aperture mask at the back focal plane of L1 is not necessary for an ideal axicon (with infinitely small conical tip), but is necessary in practice to block the unwanted light refracted through the tip of an imperfect axicon.

When choosing between an SLM-based and an axicon-based Bessel module, several factors need to be considered. An SLM-based module offers more flexibility in terms of point spread function (PSF) engineering [15] and allows the NA and axial length of the Bessel focus to be adjusted independently [15]. Details on designing Bessel foci with different PSF profiles are discussed in Subheading 2.3.3, but a brief overview is presented here. In both methods, the axial length of the Bessel focus can be altered by varying the size of the Gaussian beam impinging on the SLM or the axicon, for example, by adding a beam expander or reducer at the entrance of the Bessel module. With the beam size fixed, a user can adjust the NA and axial length of the Bessel focus independently by changing the phase pattern on the SLM and the dimensions of the annular mask. Therefore, an SLM-based module is ideal for systems utilizing multiple objectives with different NAs (e.g., a 1.05-NA objective for neocortical imaging or a 0.5-NA microendoscopic lens for deep brain imaging). In contrast, an axicon-based module does not allow users to adjust NA and axial focal length independently without varying the beam size or introducing a different axicon [39, 40]. Translating one of the conjugation lenses (L2 or L3) along the optical axis concurrently changes the NA and axial length of the Bessel focus [16]. However, despite being less flexible in PSF engineering, the axicon-based Bessel module is nevertheless an attractive alternative for the following reasons: first, its transmissive layout occupies less space and makes it easier to incorporate into an existing system [38]; second, an axicon module costs much less (~\$5000, rather than \$30,000 for an SLM-based module) to set up; third, an axicon works with a larger wavelength range compared with an SLM (e.g., compatible with three-photon microscopy [37]).

2.3.3 Design of a Bessel Module

We use MATLAB® to calculate the PSF profiles of Bessel foci and to guide the design of the Bessel module. The underlying physics is described below, and the MATLAB code can be found in Refs. [15, 16].

Generation of Annular Illumination with an SLM (Adapted from Ref. [15])

Concentric binary grating patterns with phase values alternating between 0 and π are applied to a phase-only SLM to diffract most of the incident electromagnetic field into the ±1 orders (see note 2 in Subheading 2.4), which after lens L1 forms a ring at the mask plane. The radius of the ring (ρ), determined by the period of the grating (d), the focal length of L1 (f<sub>1</sub>), and the wavelength of the light (λ), is calculated from the grating equation as:

$$
\rho = \frac{f_1 \lambda}{d}.
$$

For an annular mask to transmit the ring and block the other diffraction orders, its inner and outer diameters D<sub>i</sub> and D<sub>o</sub> should satisfy the relation:

$$D_\mathrm{o} + D_\mathrm{i} = 4\rho.$$

Combining the two equations above, to generate an annular illumination pattern that centers on an annular mask with inner and outer diameters D<sub>i</sub> and D<sub>o</sub>, the period of the circular binary grating on the SLM is:

$$d = \frac{4 f_1 \lambda}{D_\mathrm{o} + D_\mathrm{i}}.$$

With the size of the SLM pixel defined as p, the period of the circular binary grating in units of pixels S is:

$$S = \frac{d}{p} = \frac{4 f_1 \lambda}{p\,(D_\mathrm{o} + D_\mathrm{i})}.$$

The thickness of the annulus at the mask plane generated by the above circular binary grating is ~2f<sub>1</sub>λ/beamD, with beamD being the diameter of the excitation laser on the SLM [43].
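The grating relations above can be collected into a short numerical sketch. This is an illustrative Python transcription of the formulas, not the MATLAB code of Ref. [15]; the function name, the dictionary keys, and all numerical values in the example call are hypothetical.

```python
def slm_grating_design(f1_mm, wavelength_um, d_inner_mm, d_outer_mm,
                       pixel_um, beam_diameter_mm):
    """Grating period and ring geometry for an SLM-based Bessel module.

    Implements rho = (D_o + D_i) / 4, d = f1 * lambda / rho, S = d / p,
    and the approximate ring thickness 2 * f1 * lambda / beamD.
    """
    wavelength_mm = wavelength_um * 1e-3
    pixel_mm = pixel_um * 1e-3
    ring_radius_mm = (d_outer_mm + d_inner_mm) / 4.0      # ring centred on the mask
    period_mm = f1_mm * wavelength_mm / ring_radius_mm    # grating equation
    period_px = period_mm / pixel_mm                      # period in SLM pixels
    ring_thickness_mm = 2.0 * f1_mm * wavelength_mm / beam_diameter_mm
    return {
        "ring_radius_mm": ring_radius_mm,
        "period_mm": period_mm,
        "period_px": period_px,
        "ring_thickness_mm": ring_thickness_mm,
    }


# Hypothetical example values (not from the chapter): f1 = 200 mm, 940 nm light,
# a 1.2/1.5 mm mask, 15 um SLM pixels, and a 6 mm beam on the SLM.
design = slm_grating_design(200.0, 0.94, 1.2, 1.5, 15.0, 6.0)
for key, value in design.items():
    print(f"{key}: {value:.4f}")
```

In practice the period in pixels generally comes out non-integer, so the pattern actually displayed on a pixelated SLM only approximates the ideal period.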

Generation of Annular Illumination with an Axicon

For an axicon with an apex angle A, the angle of incidence α on the conical surface is α = (π − A)/2. The refraction angle is then derived using Snell's law: α<sub>r</sub> = nα, given that α is small and n is the refractive index of the axicon. Therefore, the angle between the refracted light and the optical axis is:

$$
\theta = \alpha_\mathrm{r} - \alpha = (n-1)\alpha.
$$

The radius of the ring at the mask plane is:

$$
\rho = \theta f_1 = (n-1)\alpha f_1.
$$

The annular mask to transmit the ring should again satisfy:

$$D_\mathrm{o} + D_\mathrm{i} = 4\rho.$$

Therefore, the apex angle and refractive index of the axicon should meet:

$$(n-1)\alpha = \frac{D_\mathrm{o} + D_\mathrm{i}}{4 f_1}.$$
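As a quick numerical check of the axicon relations, the sketch below evaluates the ring radius in the small-angle approximation, taking the angle of incidence as α = (π − A)/2 for an apex angle A. The function names and the example values (179° apex angle, n = 1.45, f1 = 100 mm) are hypothetical and serve only as an illustration.

```python
import math


def axicon_ring_radius_mm(apex_angle_deg, refractive_index, f1_mm):
    """Ring radius at the mask plane, rho = (n - 1) * alpha * f1 (small-angle)."""
    alpha = (math.pi - math.radians(apex_angle_deg)) / 2.0  # angle of incidence
    theta = (refractive_index - 1.0) * alpha                # deflection from the axis
    return theta * f1_mm


def required_f1_mm(apex_angle_deg, refractive_index, d_inner_mm, d_outer_mm):
    """Focal length of L1 that centres the ring on a given annular mask."""
    alpha = (math.pi - math.radians(apex_angle_deg)) / 2.0
    return (d_outer_mm + d_inner_mm) / (4.0 * (refractive_index - 1.0) * alpha)


# Hypothetical axicon and lens, not taken from the chapter.
rho = axicon_ring_radius_mm(179.0, 1.45, 100.0)
print(f"ring radius: {rho:.4f} mm")
```

Because the ring radius scales linearly with f1 once the axicon is chosen, the second helper simply inverts the first for a mask whose diameters satisfy D_o + D_i = 4ρ.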

Calculation of Two-Photon Excitation PSF

Two-photon excitation PSF can be calculated using Richards and Wolf integrals [44, 45], from which both the lateral and axial full width at half maximum (FWHM) of the PSF can be determined. The information required for PSF calculation includes: the wavelength of excitation light, the objective information (i.e., NA, magnification, immersion media), and the electric field distribution at the objective back focal plane.
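To give a feel for how the annulus geometry sets the axial extent of the focus, the sketch below evaluates only the on-axis field with a scalar Debye integral and a uniformly filled annulus. This is a deliberately simplified stand-in for the full vectorial Richards and Wolf calculation and is not the code of Refs. [15, 16]; the function names, the uniform-amplitude assumption, and the example NA values, wavelength, and immersion index are all assumptions made for illustration.

```python
import numpy as np


def axial_two_photon_profile(z_um, wavelength_um, n_medium,
                             na_inner, na_outer, n_theta=400):
    """Normalised on-axis two-photon excitation profile of an annular pupil.

    Scalar Debye approximation with uniform pupil amplitude:
    E(z) ~ integral of sqrt(cos t) * sin t * exp(i k n z cos t) dt over the
    annulus, and the two-photon signal is |E|^4.
    """
    k = 2.0 * np.pi * n_medium / wavelength_um
    theta = np.linspace(np.arcsin(na_inner / n_medium),
                        np.arcsin(na_outer / n_medium), n_theta)
    phase = np.exp(1j * k * np.outer(z_um, np.cos(theta)))
    integrand = np.sqrt(np.cos(theta)) * np.sin(theta) * phase
    field = integrand.sum(axis=1) * (theta[1] - theta[0])  # simple Riemann sum
    signal = np.abs(field) ** 4
    return signal / signal.max()


def fwhm_um(z_um, profile):
    """Full width at half maximum, assuming a single dominant lobe."""
    above = np.flatnonzero(profile >= 0.5 * profile.max())
    return z_um[above[-1]] - z_um[above[0]]


z = np.linspace(-150.0, 150.0, 1201)
# Hypothetical annuli with the same outer NA (0.40) but different widths.
thin = fwhm_um(z, axial_two_photon_profile(z, 0.94, 1.33, 0.37, 0.40))
thick = fwhm_um(z, axial_two_photon_profile(z, 0.94, 1.33, 0.30, 0.40))
print(f"thin annulus axial FWHM: {thin:.1f} um; thick annulus: {thick:.1f} um")
```

Lateral profiles, side rings, and polarization effects are not captured by this on-axis sketch; those require the full calculation described in the text.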

The following equations hold for all infinity-corrected objectives:

$$FL_\mathrm{obj} = \frac{FL_\mathrm{tube}}{M_\mathrm{obj}};$$

$$BPD_\mathrm{obj} = 2\,NA_\mathrm{obj}\,FL_\mathrm{obj}$$

where FL<sub>obj</sub> is the focal length of the objective, FL<sub>tube</sub> is the focal length of the tube lens, M<sub>obj</sub> is the magnification of the objective, and BPD<sub>obj</sub> is the back pupil diameter of the objective. The Bessel annulus at the back focal plane with an outer diameter D<sub>o</sub> and inner diameter D<sub>i</sub> gives rise to an excitation NA as:

$$NA_\mathrm{Bessel} = \frac{D_\mathrm{o}}{BPD_\mathrm{obj}}\, NA_\mathrm{obj}.$$

Or:

$$NA_\mathrm{Bessel} = \frac{D_\mathrm{o}}{2\,FL_\mathrm{obj}}.$$

The PSF profile is determined by D<sub>o</sub>, D<sub>i</sub>, and the electric field distribution within the annulus, which is discussed below.
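The objective relations above translate directly into a few lines of code. In this minimal sketch the function names are hypothetical, and the example objective (25×, 1.05 NA, paired with a 180 mm tube lens) is an assumed configuration used only to put numbers on the formulas.

```python
def objective_focal_length_mm(tube_focal_length_mm, magnification):
    """FL_obj = FL_tube / M_obj for an infinity-corrected objective."""
    return tube_focal_length_mm / magnification


def back_pupil_diameter_mm(na_obj, fl_obj_mm):
    """BPD_obj = 2 * NA_obj * FL_obj."""
    return 2.0 * na_obj * fl_obj_mm


def bessel_na(annulus_outer_diameter_mm, na_obj, fl_obj_mm):
    """Excitation NA set by the outer diameter of the annulus at the pupil."""
    bpd = back_pupil_diameter_mm(na_obj, fl_obj_mm)
    return annulus_outer_diameter_mm / bpd * na_obj


# Assumed example objective: 25x, 1.05 NA, 180 mm tube lens.
fl_obj = objective_focal_length_mm(180.0, 25.0)
bpd = back_pupil_diameter_mm(1.05, fl_obj)
na = bessel_na(5.76, 1.05, fl_obj)
print(f"FL_obj = {fl_obj:.2f} mm, BPD = {bpd:.2f} mm, NA_Bessel = {na:.3f}")
```

The two forms of the NA expression are equivalent because the objective NA cancels, so only the annulus outer diameter and the objective focal length matter.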

Design of the Annular Aperture Mask in an SLM-Based Bessel Module (Adapted from Ref. [15])

The annular mask is designed to be conjugated to the objective back focal plane, with a magnification factor M, which is determined jointly by the focal lengths of lenses L2–L5. In most cases, the objective, L4, and L5 (scan lens and tube lens) are already selected and built prior to the design of the Bessel module as an add-on. Therefore, one only needs to choose L2 and L3 to fit into the available space and conjugate the mask plane to the galvos. It has been demonstrated previously that a 0.4-NA Bessel focus works well for in vivo two-photon fluorescence imaging in the brain [15], that a 0.3-NA Bessel focus performs better than a 0.4-NA one when combined with two-photon fluorescence microendoscopy [42] (due to the substantial off-axis aberrations of gradient refractive index lenses), and that a larger-NA Bessel beam is more suitable for three-photon microscopy [37]. With the objective, L2–L5, and the desired NA selected, the outer diameter of the annular mask is determined as:

$$D_\mathrm{o} = \frac{2\,NA_\mathrm{Bessel}\,FL_\mathrm{obj}}{M}.$$

With NA (i.e., D<sub>o</sub>) fixed, the axial length of the Bessel beam is dictated by the thickness of the ring, which is determined by the focal length of L1 and the beam diameter on the SLM, with smaller f<sub>1</sub> and larger beamD generating thinner rings and longer Bessel foci. The axial length of the Bessel focus can be further adjusted by fine-tuning D<sub>i</sub> and S.
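The mask sizing step can be sketched as follows. The expression used for M assumes that L2/L3 and L4/L5 each form a standard 4f relay, so that the mask is imaged onto the objective back focal plane with magnification (f3/f2)(f5/f4); this reading of "determined jointly by the focal lengths of L2–L5" is an assumption, as are the function names and every focal length in the example.

```python
def relay_magnification(f2_mm, f3_mm, f4_mm, f5_mm):
    """Mask-to-pupil magnification, assuming two cascaded 4f relays.

    L2/L3 image the mask onto the galvos (f3 / f2), and the scan lens L4 and
    tube lens L5 image the galvos onto the objective pupil (f5 / f4).
    """
    return (f3_mm / f2_mm) * (f5_mm / f4_mm)


def mask_outer_diameter_mm(na_bessel, fl_obj_mm, magnification):
    """D_o = 2 * NA_Bessel * FL_obj / M, measured at the mask plane."""
    return 2.0 * na_bessel * fl_obj_mm / magnification


# Hypothetical focal lengths and objective (not from the chapter).
m = relay_magnification(f2_mm=150.0, f3_mm=150.0, f4_mm=50.0, f5_mm=200.0)
d_o = mask_outer_diameter_mm(na_bessel=0.4, fl_obj_mm=7.2, magnification=m)
print(f"M = {m:.2f}, mask outer diameter = {d_o:.3f} mm")
```

Keeping the magnification as a separate function makes it easy to try different L2/L3 pairs against the available bench space before ordering a mask.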

The electric field distribution at the mask, including its amplitude and phase, can be visualized via simulation (Fig. 3). The annular mask selectively passes the electric field within D<sub>i</sub>/2 < r < D<sub>o</sub>/2, where r is the radial coordinate at the mask plane, and is usually centered around the largest electric field amplitude (r<sub>max</sub>, blue line, Fig. 3). Two cases are of particular interest. In Case I, D<sub>i</sub> and D<sub>o</sub> are selected to have the annulus fall on the two amplitude peaks closest to and on either side of r<sub>max</sub> (Fig. 4b, red lines in Fig. 3a, c). In Case II, the edges of the annulus are located at the two zero-amplitude crossings that are closest to and on either side of r<sub>max</sub> (Fig. 4e, green lines in Fig. 3a, c). Even though Case I has a thicker annulus than Case II, the resulting Bessel focus has a longer axial FWHM than that of Case II (Fig. 4c, f). This is because the negative parts of the electric field (between the green and red lines, Fig. 3a, c) destructively interfere with the positive parts (between the green lines) and broaden the axial FWHM. Reducing the thickness further from Case II (e.g., between the purple lines) increases the axial FWHM again. As shown in Fig. 4, to ensure that r<sub>max</sub> is located at the center of the annulus without changing D<sub>o</sub> (i.e., the NA of the Bessel beam), the period of the SLM pattern, S, can be altered concurrently with D<sub>i</sub>.

Fig. 3 Electric field distribution at the mask plane in an SLM-based Bessel module. (a) Amplitude, (b) phase, and (c) electric field at different radial positions. 0 is the location of the optical axis. The red and green dashed lines in (a) and (c) represent the inner and outer diameters of the annular masks for Case I and Case II discussed in Subheading 2.3.3.4. (This figure is adapted from Ref. [15])

In summary, the two cases discussed above represent typical examples of long and short Bessel foci and can be used as a starting point for the mask design. If space permits, inserting a beam expander or a beam reducer before the SLM can further vary the axial length of the Bessel focus (see note 3 in Subheading 2.4). The MATLAB code to facilitate this module design can be found in Ref. [15].

Design of an Axicon-Based Bessel Module

When using an axicon for Bessel beam generation, one should start by selecting a high-quality axicon (e.g., 1-APX-2-H254-P, ALTECHNA Inc.; XFL25–010-U-B, ASPHERICON). Once the axicon is selected, the apex angle A is determined (see Subheading 2.3.3.2); thus, the radius of the ring at the mask plane is determined only by the focal length of L1 (Fig. 2b). From here, one

Fig. 4 Profiles of Bessel foci using masks with different ring thickness but the same outer diameter. SLM phase pattern, the dimension of the electric field at the annular mask, and a measured profile of a 2 μm diameter fluorescent bead, corresponding to (a–c) Case I and (d–f) Case II in Fig. 3. (g, h) Simulated axial profiles of Bessel foci generated by annular masks of different widths. (This figure is adapted from Ref. [15])

can simulate the electric field at the mask plane and the 3D PSF (code can be found in Ref. [16]) to guide the selection of L1, L2, and L3. Example simulation results are presented in Fig. 5. In this simulation, lens L2 is translated along the optical axis. The location where the front focal plane of L2 coincides with the back focal plane of L1 (i.e., the mask plane) is indicated as D = 0 mm (Fig. 5, second column). When L2 is moved toward the mask plane (negative D), more power is allocated to the central region of the objective back focal plane (Fig. 5, first column), which yields a smaller effective NA and, together with the phase distribution of the pupil function, a longer Bessel focus. In contrast, moving the lens away from the mask plane distributes more power to the edge of the objective pupil function, leading to a bigger effective NA and a shorter focus (Fig. 5, third column).

The mean radius of the annular mask becomes well defined once L1 and axicon are both selected (see Subheading 2.3.3.2), although one can jointly vary the outer and inner radii of the ring and obtain different ratios of transmittance and axial profiles. The mask is intended to eliminate the unwanted light passing through the imperfect, typically round, tip of the axicon, which if not blocked can interfere with the rest of the refracted electric field and cause the measured PSF to deviate from simulation. The thinner the annular mask is, the more effectively it can block the unwanted light, but more power loss will be introduced to the system. For the case when limited excitation power is available, one can use a thicker mask, or even no mask, and obtain non-theoretical yet still usable PSF profiles [16].

2.3.4 Alignment of Bessel Module (Adapted from Ref. [15])

Installation of the Optical Components

The Bessel focus module is located between the laser and the two-photon fluorescence microscope. First, obtain locations of the front and back focal planes of lenses L1–L3 (e.g., from their Zemax models or vendor specifications). Second, place lenses so that the back focal plane of L3 is at the first scanner (or between the two scanners, if they are not conjugated), the front focal plane of L3 is superimposed with the back focal plane of L2, and the front focal plane of L2 is superimposed with the back focal plane of L1 (i.e., the mask). Finally, place the SLM or axicon at the front focal plane of L1 (see note 4 in Subheading 2.4). It is helpful to have two mirrors with tip/tilt mounts between L3 and the first scanner to co-align Gaussian and Bessel paths. Alternatively, if there is not enough space, set up one mirror with both tip/tilt and translational controls.

Gross Alignment of the Bessel and Gaussian Beam Paths

Set up two irises in the optical path shared by the Gaussian and Bessel focus modalities between the alignment mirror(s) mentioned in Subheading 2.3.4.1 and the first scanner. They should be sufficiently separated and not conjugated with each other. Ideally, the first alignment iris in the Bessel module should

Fig. 5 Engineering Bessel foci by displacing lens L2 in an axicon-based Bessel module. (a, e, i) 2D and (b, f, j) 1D (along x direction) representations of the amplitude and phase of the electric fields at the objective back focal plane, (c, g, k) axial, and (d, h, l) lateral two-photon excitation point spread functions for L2 displacements of D = −20 mm, 0 mm, and 20 mm, respectively. (This figure is adapted from Ref. [16])

be placed right after the alignment mirror(s) and the second iris as close to the first scanner as possible to increase the accuracy of the alignment. Adjust the positions of the irises to center on the (well-aligned) Gaussian beam. For an SLM-based Bessel module, apply a concentric binary grating on the SLM; the reflected beam then appears as concentric rings. Then adjust the mirror(s) between L3 and the first scanner iteratively so that the Bessel concentric rings pass through and center on the two irises (see note 5 in Subheading 2.4).

Fine Alignment For fine alignment, dismount the objective and place a camera (e.g., DCC1545M, Thorlabs) at the back focal plane of the objective. Under the Bessel mode, one should see a ring on the camera. (For a well conjugated system, where the scanners are all conjugated to each other, sweeping one or more scanners should not move the position of the ring on the camera. But if the microscope does not have its scanners conjugated, place the camera at the plane with minimal movement.) Adjust the axial positions of L2 and L3 so that the ring is sharpest on the camera, which indicates that the back focal plane of the objective is conjugated to the annular mask (see notes 6–8 in Subheading 2.4).

Placement of annular filter mask: Place the annular filter mask at the back focal plane of L1. Apply the corresponding concentric binary grating as calculated in Subheading 2.3.3 and adjust first the lateral position of the mask until the post-mask ring is symmetric and then the axial position so that the power passing through the annular filter is maximized (see note 9 in Subheading 2.4).

2.3.5 Results and Data Analysis

In this section, representative in vivo demonstrations of Bessel focus scanning two-photon fluorescence microscopy (2PFM) are presented. With an SLM-based Bessel module aligned as described in Subheading 2.3.4, Bessel beams of different NAs and lengths were employed to image the cortical neurites of an awake Thy1-YFP line H mouse (Fig. 6). With the increase of NA or axial length of the Bessel focus, more energy is distributed to the side rings, which reduces image contrast. 0.4-NA Bessel foci provide high signal-to-background ratio while maintaining the ability to resolve synapses laterally in two-photon microscopy (Fig. 7). A single 2D scan of a Bessel focus probes all structures from the entire volume (e.g., a depth range of 60 μm in Fig. 7), and the calcium activities in individual dendritic spines can be characterized (Fig. 7c). In contrast, a minimum of 36 scans of a Gaussian beam (calculated as the ratio of the axial range of structures to the axial FWHM of the Gaussian focus) are required to cover the same volume (60 2D scans were used to generate Fig. 7a). By using longer Bessel foci, imaging throughput can be improved by more than 100-fold, with data size reduced by the same factor.
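The plane-count estimate in the paragraph above amounts to dividing the axial range by the axial FWHM of the Gaussian focus and rounding up. In the minimal sketch below, the function names are hypothetical, and the Gaussian axial FWHM of 1.7 μm is an assumed value (it is not quoted in the text) picked only because it is consistent with the 36-scan figure for a 60 μm range.

```python
import math


def min_gaussian_planes(axial_range_um, gaussian_axial_fwhm_um):
    """Minimum number of Gaussian-focus planes needed to tile an axial range."""
    return math.ceil(axial_range_um / gaussian_axial_fwhm_um)


def throughput_gain(n_gaussian_planes, n_bessel_scans=1):
    """Ratio of 2D scans (and raw data size) per volume, Gaussian versus Bessel."""
    return n_gaussian_planes / n_bessel_scans


# 60 um axial range from the text; 1.7 um FWHM is an assumed value.
planes = min_gaussian_planes(60.0, 1.7)
print(f"planes: {planes}, gain over one Bessel scan: {throughput_gain(planes):.0f}x")
```

The same ratio gives the reduction in raw data size per volume, since one Bessel frame replaces the whole Gaussian stack.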

Fig. 6 In vivo images of cortical neurites taken with Bessel foci of different NA and axial lengths, in a head-fixed awake Thy1-YFP mouse. An SLM-based Bessel module was used. (a) Images obtained by 2D scans of Bessel foci and their axial point spread functions. (b) A Gaussian image stack with structures color-coded by depth obtained with a 1.05-NA Gaussian focus. (From Ref. [15])

With an axicon-based Bessel module, translating L2 allows Bessel foci of different lengths and NAs to be generated. Functional recording from neurons labeled with calcium indicator dye (Cal-520 dextran) in zebrafish larvae is presented in Fig. 8. With the axial length of the Bessel focus gradually increasing (Fig. 8a–h), more and more structures are brought into focus using a single 2D scan.

As mentioned in Subheading 2.1.3, Bessel focus greatly reduces axial motion artifacts due to its longer axial extent. As a result, only lateral 2D registration is required for correcting sample

Fig. 7 Bessel focus scanning technology improves imaging throughput while maintaining synaptic lateral resolution in calcium imaging of GCaMP6s+ neurites in an awake mouse brain in vivo. (a) Average intensity projection of an image stack acquired by 60 2D scans of a Gaussian focus (1.05 NA), with structures color-coded by depth. (b) A single 2D scan of a Bessel focus (0.4 NA, 53 μm FWHM) probes the same volume. Insets: zoomed-in views of dendritic spines. (c) Calcium activity traces from axonal varicosities (putative boutons) and dendritic spines (arrowheads in b). (From Ref. [15])

Fig. 8 An axicon-based Bessel module enables 50 Hz volumetric calcium imaging of spinal projection neurons in zebrafish larvae. (a) Image acquired by Gaussian focus scanning at 127 μm from the dorsal surface of the head (relative depth z = 17 μm). (b) Averaged calcium transients of neurons evoked by the acoustomechanical tapping stimuli. (c, e, g) Volumetric images obtained by scanning short (14 μm axial FWHM), medium (24 μm axial FWHM), and long (39 μm axial FWHM) Bessel foci, respectively. (d, f, h) Averaged calcium transients of responsive neurons. (i, j) Average intensity projections of a 66-μm-thick image stack acquired by Gaussian focus scanning. Color in (i) encodes relative depth. Eleven trials were averaged in (b), (d), (f), and (h). Shadow represents standard deviations. (From Ref. [16])

motion, instead of the much more computationally intensive 3D registration/interpolation. As a demonstration, a Thy1-YFP mouse was imaged in vivo and the fluorescence traces of the same dendritic ROI measured with Gaussian and Bessel focus scanning, respectively, were plotted (Fig. 9). Since the neurons were labeled with

Fig. 9 Bessel focus scanning is resistant to axial motion artifacts. (a, b) Images obtained with 2D scans of a Gaussian focus and a Bessel focus, respectively, from an awake Thy1-YFP line H mouse. (c, d) Brain motion (upper panel, quantified as the lateral image displacement with time) causes (c) large changes of fluorescence signal from two YFP+ dendrites (ROI1 and ROI2) in Gaussian focus scanning mode, (d) but not in Bessel focus scanning mode. (From Ref. [15])

yellow fluorescent protein (YFP) rather than activity sensors (e.g., GCaMP6), fluctuations in the traces were motion artifacts, which only showed up in Gaussian but not Bessel traces.

2.4 Notes

1. The beam expander after the EOM (Fig. 2a and Subheading 2.3.1) is optional, but included in all our custom-built 2PFM systems. The output from a typical laser usually has a small (e.g., ~1 mm) diameter and a short Rayleigh length, and thus diverges quickly during propagation. Expanding the beam increases the Rayleigh length and reduces divergence during beam propagation.


move laterally as it moves up and down. To position the annular ring more easily on the objective focal plane, a laser positioner consisting of two prism mirrors mounted on two translation stages can be included in the system. The motion axes of the two translation stages are orthogonal to each other, which enables both x and y position adjustment.


#### 3 Widefield Fluorescence Microscopy with Optical Sectioning

3.1 Background

In standard widefield fluorescence microscopy, excitation light simultaneously illuminates the sample area, from which the emitted fluorescence is collected by a microscope objective and forms an image of fluorescent structures on a camera. The parallel nature of illumination and fluorescence detection enables widefield microscopy to reach high frame rates. However, in standard (i.e., non-light-sheet illumination) widefield microscopy, emitted fluorescence comes from both structures in the focal plane and structures above and below the focal plane. Its inability to eliminate the out-of-focus background fluorescence limits its application to thin samples such as cultured cells or ultrathin tissue sections. A powerful approach that imparts optical sectioning capability to widefield fluorescence microscopy utilizes structured illumination (SI) [9, 23–25]. Because a structured illumination pattern has its highest contrast in the focal plane, the in-focus fluorescence image is modulated while the out-of-focus background fluorescence is unmodulated. Optically sectioned images can be reconstructed by taking advantage of the difference in signal modulation [23, 24]. SI can also be used to down-modulate sample spatial frequencies and lead to super-resolution imaging [18, 20, 21, 46]. In this section, we describe how to build, align, and operate a widefield structured illumination microscope (SIM). Because super-resolution SIM (SR-SIM) has been reviewed extensively previously [46], we focus on an optical sectioning SIM (OS-SIM) implemented with a refined image reconstruction method for accurate structural and functional imaging of neurons and synapses in vivo [9].
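The idea of separating modulated in-focus signal from unmodulated background can be illustrated with the classic three-phase square-law reconstruction, in which three images are taken with the illumination pattern shifted by one third of a period each time. This is a swapped-in textbook method shown only to make the principle concrete; it is not the refined reconstruction of Ref. [9]. The function name, the normalization factor, and the synthetic test signal are assumptions.

```python
import numpy as np


def three_phase_sectioning(i1, i2, i3):
    """Optically sectioned image from three phase-shifted SI frames.

    The frames are assumed to be acquired with pattern phases 0, 2*pi/3 and
    4*pi/3. Unmodulated background is identical in all three frames and
    cancels in the pairwise differences; the sqrt(2)/3 factor scales the
    result to the local modulation amplitude.
    """
    i1, i2, i3 = (np.asarray(a, dtype=float) for a in (i1, i2, i3))
    squared = (i1 - i2) ** 2 + (i2 - i3) ** 2 + (i3 - i1) ** 2
    return (np.sqrt(2.0) / 3.0) * np.sqrt(squared)


# Synthetic 1D example: in-focus signal of amplitude 2 modulated with
# contrast 0.8, sitting on an unmodulated background of 5 (all hypothetical).
x = np.linspace(0.0, 4.0 * np.pi, 256)
frames = [5.0 + 2.0 * (1.0 + 0.8 * np.cos(x + 2.0 * np.pi * k / 3.0))
          for k in range(3)]
sectioned = three_phase_sectioning(*frames)
print(f"mean sectioned signal: {sectioned.mean():.3f}")
```

The square-law form is sensitive to noise and to phase-step errors, which is one motivation for more refined reconstruction approaches.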

3.2 Material and Equipment

3.2.1 Optical Components

Figure 10 shows an example schematic diagram of a SIM setup. After passing through an acousto-optic tunable filter (AOTF; AA Opto-Electronic, AOTFnC-400.650-TN) (Fig. 11), an excitation laser beam is expanded to match the active area of a spatial light modulator. To produce sinusoidal illumination patterns at the focal plane, the laser light reflects off the spatial light modulator (SLM; Forth Dimension Displays Ltd., QXGA-3DM) displaying binary gratings. Two achromatic half-wave plates (HWP1 and HWP2; Bolder Vision Optik, BVO AHWP3) and a polarizing beam splitter (PBS; Thorlabs, PBS251) direct the laser to the SLM and maximize its diffraction efficiency off the SLM by ensuring the right polarization direction. The diffracted light then transmits through the PBS and has its polarization further controlled by another HWP (HWP3; Bolder Vision Optik, BVO AHWP3) and a quarter-wave plate (QWP; Bolder Vision Optik, BVO AQWP3), both of which are mounted in a fast rotator (FR; Finger Lakes Instrumentation, A24201). A dichroic mirror (D1; Semrock, Di-R405/488/561/635-t3-25×36) reflects the illumination laser and transmits the emitted fluorescence, and an identical compensating dichroic mirror (D2, shown in Fig. 12) is used to minimize polarization scrambling (more details below). An objective lens (Nikon, CFI Apo LWD 25X, 1.1 NA and 2 mm WD) is used for both illumination and fluorescence collection. The emitted fluorescence is focused and imaged on a camera (Hamamatsu, Orca Flash 4.0). The focal lengths of lenses L1–L8 in the microscope are 150 mm, 125 mm, 400 mm, 400 mm, 175 mm, 300 mm, 85 mm, and 75 mm, respectively.

3.2.2 Fixed Mouse Brain Slices Preparation To demonstrate structural imaging with OS-SIM, we prepared brain slices from a Thy1-GFP line M transgenic mouse (The Jackson Laboratory, stock 007788). After being completely sedated with isoflurane (Piramal), the mouse was transcardially perfused first with phosphate buffered saline (PBS, Invitrogen) followed by 4% paraformaldehyde (PFA, Electron Microscopy Sciences). The mouse brain was dissected and immersed in 2% PFA and 15% sucrose in PBS solution overnight at 4 °C. Then the immersion solution was replaced with 30% sucrose in PBS. After 24 h, the

Fig. 10 Detailed optical layout of a structured illumination microscope. (a) An amplitude mask in a rotational mount selectively transmitting first-order diffraction beams. (b) Structured illumination generated by two-beam interference at the sample plane. See Subheading 3.1 for detailed information of the optical components

Fig. 11 Laser beam multiplexing with single-edge laser dichroic beam splitters

Fig. 12 The polarization scrambling by a dichroic mirror (D1) can be compensated by another identical dichroic mirror (D2) properly oriented

mouse brain was sectioned on a microtome (Thermo Scientific™, Microm HM430) to 100-μm-thick slices, and then mounted on microscope slides to dry. After ~45 min, we placed cover glasses with mounting medium (Vectashield® Hardset™ Antifade mounting medium, H-1400) on top of the microscope slides with brain slices.

3.2.3 Drosophila Larvae Preparation To demonstrate functional imaging with OS-SIM, we used transgenic Drosophila third instar larvae. A GCaMP6f-based postsynaptically targeted genetically encoded calcium indicator was expressed in Drosophila larval muscle throughout development (genotype: w1118; OK6-Gal4/UAS-CpxRNAi (BDSC Line #42017); MHC-CD8-GCaMP6f-Sh/+). Larvae were dissected using a traditional semi-intact fillet preparation in HL3 solution (concentration in mM: 70 NaCl, 5 KCl, 0.45 CaCl2·2H2O, 20 MgCl2·6H2O, 10 NaHCO3, 5 trehalose, 115 sucrose, 5 HEPES, and with pH adjusted to 7.2) before imaging. During imaging, to maintain viability, the larval fillet was immersed in HL3 containing 1.5 mM CaCl2·2H2O and 25 mM MgCl2·6H2O.

3.3 Methods In this section, we provide instructions on how to successfully build a structured illumination microscope. Most instructions apply to both OS-SIM and SR-SIM. Throughout this section, we provide useful notes where SR-SIM differs from OS-SIM in implementation.

3.3.1 SIM Setup The illumination and emission light paths for a structured illumination microscope follow that of a standard widefield microscope, with added components/modules to generate and optimize the structured (e.g., sinusoidal fringe) illumination at the sample plane (Fig. 10). In the following subsections we discuss critical steps in generating SIM illumination.

Laser Beam Multiplexing, Shuttering, and Expansion Widefield fluorescence microscopes usually use continuous-wave (CW) lasers as excitation sources. In this section, we demonstrate the imaging of GFP (green fluorescent protein)-expressing samples; therefore, a single 488-nm CW laser is shown in Fig. 10. For multi-color imaging, multiple CW lasers with different wavelengths can be multiplexed using dichroic beam splitters at 45° angles of incidence. The output power of multiple CW lasers can be controlled with a single AOTF placed right after the combined laser beams. When combining multiple laser beams, it is important to ensure that they are all perfectly directed into the entrance of the AOTF and follow the same light path to the sample plane. We recommend starting with a single CW laser (with the shortest wavelength, i.e., the laser closest to the AOTF) to build the microscope. Once its alignment is optimized, other CW lasers can be added to the light path. Each laser beam must have independent tip and tilt adjustments, which can be done by using two successive mirrors between the laser and the dichroic beam splitter. Beam expansion can be done by using either pairs of achromatic lenses or integrated beam expanders. Either way, we need to ensure that each laser beam is collimated after expansion, which can be qualitatively tested with a commercially available shearing interferometer (e.g., SI035 from Thorlabs) that consists of a wedged optical flat (mounted at 45° to the light path) and a diffuser on top. If the beam is collimated, the interference fringes produced by the reflections from the front and back surfaces appear parallel to the reference line on the diffuser. Some beam expanders have a sliding lens design, facilitating easy divergence adjustment with a collimation adjustment ring (e.g., GBE02-A from Thorlabs). One needs to rotate the adjustment ring while observing the interference fringes on the diffuser until they are parallel to the reference line.

## Beam Modulation Module

#### 1. Structured Illumination Generated by Two-Beam Interference

The structured illumination at the sample plane can be produced by two-beam interference. In our implementation, a spatial light modulator (SLM) is placed in conjugation to the sample plane (Fig. 10). To generate a sinusoidal illumination pattern, the SLM displays a binary grating pattern. The two first-order diffraction beams are selected to transmit through an amplitude mask (Fig. 10a) positioned at the focal plane of L1, and then imaged onto the objective back focal plane by a pair of lenses (L2–L3). The two beams exiting the objective interfere at the sample plane, producing sinusoidal patterns (Fig. 10b). It is recommended to use a flip mount for the amplitude mask, allowing an easy switch between uniform and structured illumination.

#### 2. Producing the Desired Illumination Patterns

There are a few important parameters to consider when choosing the sequence of gratings displayed on the SLM: the pattern period, pattern phase, pattern orientation, and the number of SI images, which are optimized for specific samples and applications.

For a binary grating displayed on the SLM with period d, the diffraction angles θ<sub>m</sub> are:

$$
\sin\left(\theta\_m\right) = \frac{m\lambda}{d},
$$

where m is the diffraction order, and λ is the illumination wavelength.

At the focal plane of L1 (focal length FL<sub>1</sub>), the distance between the ±1-order diffraction beam focal spots, h<sub>±1</sub>, is:

$$h\_{\pm 1} = 2FL\_1 \tan \theta\_1 \approx 2FL\_1 \times \frac{\lambda}{d}.$$

The focal spots on the mask are then imaged onto the objective back focal plane by L2 and L3 (focal lengths FL<sub>2</sub> and FL<sub>3</sub>). H<sub>±1</sub>, the distance between the two spots at the back focal plane, becomes:

$$H\_{\pm 1} = 2FL\_1 \times \frac{\lambda}{d} \times \frac{FL\_3}{FL\_2}.$$

The SI frequency/period at the sample plane is determined by the ratio of H<sub>±1</sub> to the diameter of the objective back pupil. A ratio of 1 means the two beams are focused to the edge of the objective's back pupil (i.e., occupying the full numerical aperture of the objective), and then exit the objective at the largest-possible angles, generating a sinusoidal interference pattern of the highest-possible spatial frequency at the focal plane (i.e., diffraction-limited spatial frequency).
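As a rough numerical check of the relations above, the following Python sketch propagates a grating period through L1 to L3 and reports where the ±1-order spots land at the objective back pupil. The wavelength, focal lengths, and NA are the values quoted in this section. The 40-μm grating period and the 8-mm objective focal length (a 25× objective with an assumed 200-mm tube lens) are illustrative assumptions, as are all function names. The fringe period uses the standard two-beam relation λ/(2·NA<sub>eff</sub>), with NA<sub>eff</sub> taken as the pupil fill ratio times the objective NA.

```python
import math

def first_order_separations(wavelength_mm, period_mm, fl1_mm, fl2_mm, fl3_mm):
    """Return (theta_1, h_pm1, H_pm1) for a grating of the given period."""
    theta_1 = math.asin(wavelength_mm / period_mm)   # sin(theta_m) = m*lambda/d with m = 1
    h_pm1 = 2.0 * fl1_mm * math.tan(theta_1)         # spot separation at the mask (focal plane of L1)
    big_h_pm1 = h_pm1 * fl3_mm / fl2_mm              # relayed to the back focal plane by L2-L3
    return theta_1, h_pm1, big_h_pm1

def pupil_fill_ratio(big_h_pm1_mm, na, f_objective_mm):
    """Spot separation divided by the back-pupil diameter (taken as 2*NA*f)."""
    return big_h_pm1_mm / (2.0 * na * f_objective_mm)

def fringe_period_um(wavelength_um, na, fill_ratio):
    """Two-beam fringe period at the sample: lambda / (2 * NA_eff)."""
    return wavelength_um / (2.0 * na * fill_ratio)

# From the text: 488 nm, FL1 = 150 mm, FL2 = 125 mm, FL3 = 400 mm, NA = 1.1.
# Assumed for illustration: d = 40 um, objective focal length = 8 mm.
theta_1, h_pm1, big_h_pm1 = first_order_separations(488e-6, 0.040, 150.0, 125.0, 400.0)
ratio = pupil_fill_ratio(big_h_pm1, na=1.1, f_objective_mm=8.0)
period = fringe_period_um(0.488, 1.1, ratio)
print(f"h = {h_pm1:.2f} mm, H = {big_h_pm1:.2f} mm, ratio = {ratio:.2f}, period = {period:.3f} um")
```

Keeping the three steps as separate functions makes it easy to substitute the actual pupil diameter of a given objective if it is known from the manufacturer.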

To change the phase or the orientation of the illumination pattern, we laterally shift or rotate the binary grating on the SLM, respectively. This results in the sinusoidal interference pattern at the focal plane shifting by a specific phase or rotating to a corresponding orientation.
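The shift and rotation operations can be sketched with NumPy as follows. This is a minimal illustration, not the SLM vendor's interface: the function name, the 50% duty cycle, and the 12-pixel period are assumptions. Note that if only the ±1 orders reach the sample, the fringe phase advances by twice the grating phase, so grating phases of 0, 2π/3, and 4π/3 still produce three fringe phases spaced by 120°, only in a different order.

```python
import numpy as np

def binary_grating(shape, period_px, phase_rad=0.0, angle_rad=0.0):
    """Binary (0/1) grating with 50% duty cycle.

    phase_rad shifts the grating laterally; angle_rad rotates its wave vector.
    """
    rows, cols = shape
    y, x = np.mgrid[0:rows, 0:cols]
    # Pixel coordinate measured along the grating wave vector.
    u = x * np.cos(angle_rad) + y * np.sin(angle_rad)
    cycles = u / period_px + phase_rad / (2.0 * np.pi)
    return (np.mod(cycles, 1.0) < 0.5).astype(np.uint8)

# Three grating phases for one orientation (hypothetical 12-pixel period).
patterns = [binary_grating((64, 64), 12, phase_rad=k * 2.0 * np.pi / 3.0) for k in range(3)]
```

For a pixelated display, choosing a period that is a multiple of the number of phase steps keeps every shift equal to a whole number of pixels.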

For OS-SIM, the SI period should be selected based on the desired optical section thickness and the sample. In theory, the patterns that provide the maximal optical-sectioning strength (i.e., generate the thinnest optical sections) have spatial frequency that is half of the diffraction limit [23, 47]. In practice, when imaging a thick or densely labeled sample, the strong fluorescence background sometimes makes it difficult to detect the modulated signal in the focal plane if the optical section is too thin. In such cases, the SI spatial frequency should be empirically determined to optimize OS-SIM image quality. For SR-SIM, SI patterns with higher spatial frequency lead to higher lateral resolution. SI with spatial frequency at the diffraction limit, however, loses optical sectioning capability (because the resulting standing wave illumination maintains contrast even out of the focal plane). In thick samples, a compromise between optical sectioning strength and spatial resolution is required [8] for SR-SIM.
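To turn a target SI frequency into an SLM setting, the small-angle relation for H<sub>±1</sub> can be inverted for the grating period d. The sketch below assumes the back-pupil diameter is 2·NA·f with an 8-mm objective focal length (an assumption for a 25× objective with a 200-mm tube lens); the function name is hypothetical. A fill ratio of 0.5 corresponds to half the diffraction-limited frequency, the theoretical optimum for sectioning, and 1.0 to the diffraction limit.

```python
def grating_period_for_fill_ratio(fill_ratio, wavelength_mm, na, f_objective_mm,
                                  fl1_mm, fl2_mm, fl3_mm):
    """SLM grating period d (mm) that places the +/-1 orders at the requested
    fraction of the back-pupil diameter, from H = 2*FL1*(lambda/d)*(FL3/FL2)."""
    if not 0.0 < fill_ratio <= 1.0:
        raise ValueError("fill_ratio must be in (0, 1]")
    target_h_mm = fill_ratio * 2.0 * na * f_objective_mm
    return 2.0 * fl1_mm * wavelength_mm * (fl3_mm / fl2_mm) / target_h_mm

# Focal lengths, wavelength, and NA from the text; objective focal length assumed.
d_half = grating_period_for_fill_ratio(0.5, 488e-6, 1.1, 8.0, 150.0, 125.0, 400.0)
d_full = grating_period_for_fill_ratio(1.0, 488e-6, 1.1, 8.0, 150.0, 125.0, 400.0)
print(f"d (half of limit) = {d_half * 1e3:.1f} um, d (at limit) = {d_full * 1e3:.1f} um")
```

In practice the computed period would be rounded to a whole number of SLM pixels and then adjusted empirically, as described above for thick or densely labeled samples.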

The number of SI images varies with the reconstruction method. For example, the OS-SIM implementation proposed by Neil et al. [23] requires three SI images with the same orientation but equally spaced phases. For 2D SR-SIM [20], to achieve the greatest resolution improvement, SI images with multiple (≥3) orientations and phases (≥3) are typically used.

Maximizing Diffraction Efficiency and Pattern Contrast at the Sample Plane

A polarizing beam splitter (PBS) is used to reflect the excitation beam toward the SLM and transmit the diffracted beam traveling away from the SLM (Fig. 10). As the PBS reflects the s-polarized component and transmits the p-polarized component of the incident light, an achromatic half-wave plate (HWP) is placed before the PBS to control excitation beam polarization. The HWP is rotated to minimize the transmitted power, thus maximizing the beam power delivered to the SLM. Our SLM has maximal diffraction efficiency for p-polarized light. Therefore, another HWP is placed between the SLM and the PBS. With a binary pattern displayed on the SLM, we rotate this HWP to minimize the power in the 0th-order diffraction beam.

High modulation contrast is essential for both OS-SIM and SR-SIM imaging. For OS-SIM, high contrast ensures strong modulation of in-focus signals, providing maximal optical sectioning. For SR-SIM, high contrast ensures a large magnitude of the high-frequency components, supporting the extended spatial resolution. To maximize the illumination contrast, the two interfering beams should have s-polarization at the sample plane. In principle, one can use a HWP set at an optimized angle for the grating orientation for OS-SIM, or mount a HWP in a fast rotator and maintain s-polarization for all illumination orientations used for SR-SIM. In practice, however, the s-polarization state may not be maintained when the illumination beams reach the sample plane, because optical components in the illumination light path may alter beam polarization. For example, a dichroic mirror used to separate excitation and emission light (Fig. 10, D1) may reflect and transmit the p- and s-polarization components differently, and therefore scramble the polarization of the illumination, turning originally linearly polarized light into elliptically polarized light.

To solve this problem, we can use the combination of a HWP and a QWP mounted on fast rotators (HWP3 and QWP on FRs, Fig. 10) to compensate for the ellipticity-altering effect and provide the desired polarization at the imaging plane [48]. For each pattern orientation (i.e., each desired polarization), we need to find the rotational angles of the two waveplates that maximize the contrast at the imaging plane. In practice, we take a series of SI images while rotating the two waveplates, until image contrast is maximized. For OS-SIM, the angles only need to be optimized for a single orientation. For SR-SIM, angle optimization is required for all SI orientations and the waveplates must be rotated during image acquisition to ensure maximal contrast for all SI orientations. In our system, this is achieved by mounting the waveplates in two fast rotators synchronized with SLM display and image acquisition.
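The search over waveplate angles can be sketched as a simple exhaustive scan. The code below is illustrative only: the contrast metric (fringe-frequency amplitude divided by the mean) is one possible choice, `measure_contrast` stands for a user-supplied routine that would set both rotators and acquire an SI image, and the simulated measurement with its peak at (30°, 60°) is a made-up stand-in for hardware.

```python
import numpy as np

def fringe_contrast(image, period_px):
    """Modulation contrast of fringes that vary along x (columns).

    Returns m for a profile 1 + m*cos(2*pi*x/period) when the image width
    spans an integer number of periods.
    """
    profile = np.asarray(image, dtype=float).mean(axis=0)
    x = np.arange(profile.size)
    projection = np.sum(profile * np.exp(-2j * np.pi * x / period_px))
    amplitude = 2.0 * np.abs(projection) / profile.size
    return amplitude / profile.mean()

def best_waveplate_angles(measure_contrast, hwp_angles_deg, qwp_angles_deg):
    """Return (contrast, hwp, qwp) with the highest measured contrast."""
    best = None
    for hwp in hwp_angles_deg:
        for qwp in qwp_angles_deg:
            c = measure_contrast(hwp, qwp)
            if best is None or c > best[0]:
                best = (c, hwp, qwp)
    return best

def simulated_measurement(hwp_deg, qwp_deg):
    """Toy contrast landscape peaking at (30, 60) degrees; replaces real hardware."""
    x = np.arange(120)
    m = 0.9 * np.exp(-((hwp_deg - 30.0) ** 2 + (qwp_deg - 60.0) ** 2) / 800.0)
    image = np.tile(100.0 * (1.0 + m * np.cos(2.0 * np.pi * x / 12.0)), (16, 1))
    return fringe_contrast(image, period_px=12)

contrast, hwp_best, qwp_best = best_waveplate_angles(
    simulated_measurement, range(0, 90, 10), range(0, 180, 10))
print(hwp_best, qwp_best, round(contrast, 3))
```

A coarse scan followed by a finer scan around the best pair would reduce the number of acquisitions; for SR-SIM the scan would be repeated for each pattern orientation.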

Alternatively, we can utilize an additional, identical dichroic mirror oriented at a specific angle relative to the first dichroic mirror [49] (Fig. 12), with the reflections upon the two dichroic mirrors interchanging the s- and p-polarization components and therefore canceling out the polarization scrambling by the two dichroic mirrors.

Fine Alignment SIM is very sensitive to misalignment, and here we provide two useful methods to ensure perfect alignment. With both methods, fine tip/tilt adjustment is made with two successive mirrors before the SLM.

The first method utilizes the back-reflection pattern from the objective lens. We first move the mask out of the light path (or flip it down if a flip mount is used). We then let the SLM display a flat pattern so that it acts as a mirror and allows standard widefield illumination. To achieve perfect alignment with the illumination beam entering the objective center along its optical axis, we hold a piece of white paper with a hole between L3 (Fig. 10) and the objective, letting the illumination beam transmit through the hole and reach the objective lens. We then can observe the light reflected from the lens components inside the objective onto the white paper. In the case of perfect alignment, the pattern appears as concentric rings (Fig. 13a); when it is misaligned, the pattern appears off-centered and scrambled (Fig. 13b).

With the second method, we directly observe the two interfering beams below the objective lens. For this, we let the SLM display a grating pattern and put the mask back in the path. Here we display on the SLM a fine grating pattern so that the two diffraction orders enter the objective lens close to its edge (typically at 80% of the NA; see Subheading 3.3.1.2 for detailed information). In the case of perfect alignment, the two beams below the objective appear symmetric in shape and with similar brightness (Fig. 14a). Choosing a fine pattern makes any clipping of the two beams easy to detect. If the light path is misaligned, the two beams appear asymmetric and with different brightness (Fig. 14b). This procedure should be repeated for different orientations to ensure perfect alignment along all directions.

Fig. 13 Back-reflected illumination light observed before the objective. (a) Perfect alignment pattern. (b) Misalignment pattern

3.3.2 SIM Detection Path In our system, the emitted fluorescence is collected by the same objective and focused onto a camera for imaging (Fig. 10) with achromatic lenses (L4–L8). Magnification from the sample plane to the camera should ensure Nyquist sampling for the desired resolution. For SR-SIM that can double the diffraction-limited resolution, the pixel size should be smaller than a quarter of the diffraction-limited widefield resolution.
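The sampling requirement can be checked with a short calculation. This sketch takes the Abbe limit λ/(2·NA) as the widefield resolution and a 520-nm emission wavelength for GFP; both are assumptions made for illustration, and other resolution criteria differ by a constant factor.

```python
def max_pixel_size_nm(wavelength_nm, na, resolution_gain=1.0):
    """Largest sample-plane pixel size that satisfies Nyquist sampling.

    resolution_gain = 1 for widefield or OS-SIM, 2 for SR-SIM that doubles
    the diffraction-limited resolution.
    """
    resolution_nm = wavelength_nm / (2.0 * na) / resolution_gain
    return resolution_nm / 2.0

widefield_limit = max_pixel_size_nm(520.0, 1.1)                       # half the resolution
sr_sim_limit = max_pixel_size_nm(520.0, 1.1, resolution_gain=2.0)     # a quarter of it
print(f"widefield/OS-SIM: <= {widefield_limit:.0f} nm, SR-SIM: <= {sr_sim_limit:.0f} nm")
```

Under these assumptions, the 86-nm pixel size quoted in the figure captions satisfies the widefield and OS-SIM criterion.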

3.3.3 Optical-Sectioning Widefield Imaging and Its Application Examples In this section, we introduce how structured illumination is utilized for optical sectioning and describe a refined OS-SIM reconstruction method optimized for in vivo imaging.

Refined OS-SIM Reconstruction Method The basic idea of OS-SIM is that only in-focus information can be effectively modulated. Figure 15 shows how structured illumination only preserves its contrast for structures in the focal plane (blue arrows). When a structure is out-of-focus (orange arrows), its fluorescence signal is not or only weakly modulated. Taking advantage of this difference in signal modulation, we could computationally reject out-of-focus background and retrieve in-focus information.

Fig. 14 Transmitted illumination light observed below objective. (a) Perfect alignment. (b) Misalignment

One popular OS-SIM implementation was proposed by Neil et al. [23], which requires three SI images with equally spaced phases of 0°, 120°, and 240°. This "basic OS-SIM" method reconstructs an optically sectioned image using Eq. (1):

$$\mathbf{I}\_{\text{basic SIM}} = \sqrt{(\mathbf{I}\_0 - \mathbf{I}\_1)^2 + (\mathbf{I}\_1 - \mathbf{I}\_2)^2 + (\mathbf{I}\_2 - \mathbf{I}\_0)^2},\qquad(1)$$

where I0, I1, and I2 are the intensities of the images at phases of 0°, 120°, and 240°, respectively. The pairwise subtractions discard the out-of-focus (unmodulated) signal, and the summation removes the non-uniform illumination pattern. As shown in Fig. 16, compared with the widefield image (Fig. 16a), basic SIM (Fig. 16b) effectively suppressed the background fluorescence. However, the signal-to-noise ratio (SNR) is suboptimal because of the positive bias created by the non-linear operation, making it difficult to resolve fine structures such as dendritic spines (Fig. 16b inset).

Fig. 15 Structured illumination only modulates in-focus signals effectively. Images at two sample planes with different structures in focus. Blue arrows: in-focus structures, orange arrows: out-of-focus structures. Scale bar: 3 μm

We proposed a method [9] to suppress high-frequency noise while maintaining optical sectioning: we first low-pass (LP) filtered the basic SIM image, high-pass (HP) filtered the noise-averaged (free of positive bias) widefield image (IWF = [I0 + I1 + I2]/3), and then carried out a weighted summation of the two to reconstruct the final optical section using Eq. (2):

$$\mathbf{I}\_{\rm SIM} = \mathrm{LP}(\mathbf{I}\_{\rm basic\ SIM}) + \boldsymbol{\alpha} \cdot \mathrm{HP}(\mathbf{I}\_{\rm WF}). \tag{2}$$

With this refined algorithm, we observed a substantial improvement in SNR, which helped to better reveal the morphology of dendrites and dendritic spines (Fig. 16c).
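Equations (1) and (2) can be sketched in a few lines of NumPy. The LP and HP filters are taken here as complementary Gaussian filters in the Fourier domain sharing one width, `sigma_freq` (in cycles per pixel); the filter shapes are not specified in this section, so that choice, the parameter values, and the function names are assumptions rather than the published implementation [9].

```python
import numpy as np

def basic_os_sim(i0, i1, i2):
    """Eq. (1): root of summed squared pairwise differences of three phase images."""
    return np.sqrt((i0 - i1) ** 2 + (i1 - i2) ** 2 + (i2 - i0) ** 2)

def gaussian_lowpass(image, sigma_freq):
    """Low-pass filter an image with a Gaussian transfer function (unity at DC)."""
    fy = np.fft.fftfreq(image.shape[0])[:, None]
    fx = np.fft.fftfreq(image.shape[1])[None, :]
    transfer = np.exp(-(fx ** 2 + fy ** 2) / (2.0 * sigma_freq ** 2))
    return np.real(np.fft.ifft2(np.fft.fft2(image) * transfer))

def refined_os_sim(i0, i1, i2, sigma_freq, alpha):
    """Eq. (2): LP(basic SIM) + alpha * HP(widefield), with HP = 1 - LP."""
    basic = basic_os_sim(i0, i1, i2)
    widefield = (i0 + i1 + i2) / 3.0
    low = gaussian_lowpass(basic, sigma_freq)
    high = widefield - gaussian_lowpass(widefield, sigma_freq)
    return low + alpha * high

# Synthetic check: uniform in-focus signal S with modulation depth m on top of
# an unmodulated background B (all values hypothetical).
x = np.arange(64)
B, S, m = 10.0, 2.0, 0.5
stack = [np.tile(B + S * (1.0 + m * np.cos(2.0 * np.pi * x / 8.0 + k * 2.0 * np.pi / 3.0)), (32, 1))
         for k in range(3)]
section = refined_os_sim(*stack, sigma_freq=0.05, alpha=1.0)
print(section.mean())   # close to 3/sqrt(2) * S * m; the background B is rejected
```

In this synthetic case the unmodulated background drops out and the reconstructed value depends only on the modulated amplitude S·m, which is the behavior the pairwise subtractions are meant to provide.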

We imaged the same dendritic structures using a two-photon fluorescence microscope (Fig. 16d), the most popular OS imaging method for brain tissue. Comparing the images taken with OS-SIM and two-photon, we found that they provided comparable optical sectioning capability. In addition, one-photon excitation of OS-SIM led to higher resolution, resulting in crisper and sharper OS-SIM dendritic spine images.

Fig. 16 Widefield, basic SIM, refined OS-SIM, and two-photon fluorescence images of Thy1-GFP line M brain slices. (a–d) Maximum intensity projections (MIPs) of 8-μm-thick widefield (WF), basic SIM, refined OS-SIM, and 2-photon image stacks (0.1 μm Z step, 21–29 μm depth, 440 × 396 pixels at 86 nm pixel size), respectively. Insets are single optical sections. Scale bar: 5 μm; insets: 1 μm

Fast Functional Imaging Using OS-SIM

Applying OS-SIM to functional imaging, we demonstrated the capability of OS-SIM in capturing in vivo calcium events at high frame rate. We performed functional imaging at Drosophila larval neuro-muscular junctions (NMJs), where the muscle was labeled with post-synaptically targeted GCaMP6f-based genetically encoded calcium indicator [50].

Fig. 17 In vivo functional imaging of quantal releases at the neuro-muscular junctions (NMJs) of a Drosophila larva with OS-SIM. (a–b) Averages of widefield (WF) and OS-SIM image sequences (frames without calcium activity) of NMJs (at a depth of 20 μm, 492 × 492 pixels at 86 nm pixel size). Scale bar: 5 μm; insets: 2 μm. (c) Lateral line profiles across the structure in the insets (b, along red dashed line). (d) Spontaneous calcium transients from 8 regions of interest (8 s of recording, orange circles in a). Widefield transients were scaled up 8-fold for better visualization. (e) Averaged calcium transients over 5 events (black asterisks in d) measured with widefield and OS-SIM

Similar to the brain slice data, OS-SIM provided excellent optical sectioning, resulting in images with much higher contrast (Fig. 17a–c). We recorded the calcium activity at the NMJs at 25 Hz OS-SIM frame rate (75 Hz for raw image frames). By calculating fluorescence change ΔF/F from eight regions of interest, we compared the sensitivity of widefield and OS-SIM imaging in reporting calcium activity (Fig. 17d). The suppression of the out-of-focus fluorescence background by OS-SIM gave rise to a ~8× larger ΔF/F than widefield imaging (Fig. 17e). Furthermore, without the contribution of the often unevenly distributed out-of-focus fluorescence, OS-SIM allowed accurate measurement of the amplitudes of calcium transients and quantitative comparison of in vivo activity in different structures.
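The effect of background on ΔF/F can be illustrated with a toy trace. All numbers below are hypothetical and were chosen only so that the ratio comes out at 8; they are not measured data. Estimating the baseline F0 as a low percentile of the trace is one common convention and is an assumption here.

```python
import numpy as np

def delta_f_over_f(trace, baseline_percentile=20.0):
    """Return (F - F0) / F0 with F0 estimated as a low percentile of the trace."""
    trace = np.asarray(trace, dtype=float)
    f0 = np.percentile(trace, baseline_percentile)
    return (trace - f0) / f0

# Hypothetical ROI: in-focus baseline of 100 counts, a transient adding 50 counts,
# and 700 counts of out-of-focus background present only in the widefield image.
in_focus = np.full(200, 100.0)
in_focus[80:90] += 50.0
widefield_trace = in_focus + 700.0
sectioned_trace = in_focus            # background rejected by optical sectioning

peak_wf = delta_f_over_f(widefield_trace).max()   # 50 / 800
peak_os = delta_f_over_f(sectioned_trace).max()   # 50 / 100
print(peak_os / peak_wf)
```

Because the background adds to F0 but not to the transient, the same calcium event yields a smaller ΔF/F in widefield, and uneven background would bias comparisons between structures.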

Parameter Selection in Image Reconstruction For optimal image reconstruction, parameters should be carefully selected based on the imaged sample as well as the desired sectioning strength. Theoretically, the maximum sectioning strength is obtained when using an illumination period twice the diffraction-limited resolution [47]. In practice, there is a tradeoff between optical sectioning strength on the one hand and modulation depth and signal-to-noise ratio on the other. As a result, when imaging a sample with strong background fluorescence, a larger illumination period should be used so that the modulated signal can be more easily detected. When extracting low- and high-spatial-frequency information from the basic SIM and widefield images, the crossover frequency, σ, was chosen to balance artifact suppression and sectioning strength. A larger σ means that we take more information from the basic SIM image, while a smaller σ means more information from the widefield image, which typically has higher SNR than the basic SIM image. Thus, when imaging samples with higher SNR such as fixed brain slices, we used a larger σ to better exploit the optical sectioning capability of the basic SIM image; when imaging noisy samples, we used a smaller σ to sacrifice optical sectioning for better image quality. The scaling factor, α, weights the widefield image to ensure continuity in the Fourier domain. The value of α was determined by the modulation depth, which can be either precisely calculated using the correlation-based algorithm [51] or empirically estimated from final image quality.

Other Optical Sectioning Reconstruction Methods We described and demonstrated our refined OS-SIM method for in vivo structural and functional imaging in previous sections. There are other optical sectioning reconstruction methods as well as structured illumination strategies. For example, differential illumination focal filtering (DIFF) microscopy [25], a variant of HiLo, reconstructs one optical section from two images with structured and complementary illumination patterns. HiLo microscopy [24, 52], another SIM method, reconstructs one optical section from one SI image and one uniform illumination image. HiLo has faster imaging speed and is less sensitive to illumination distortion and sample motion. For systems that cannot provide precise SI translation (e.g., by an SLM), HiLo is a better option. The OS-SIM system described here can also implement these alternative SIM methods. A method should be chosen after considering the application, implementation difficulty, and budget.

Other Considerations for In Vivo Imaging In addition to the often low signal-to-noise ratio, two other issues of applying OS-SIM to in vivo imaging are sample-induced wavefront distortion and motion-induced reconstruction artifacts. Previously [9], we demonstrated that aberration correction by adaptive optics is essential for OS-SIM in both structural and functional imaging of in vivo structures. For example, the imaging of the Drosophila larva suffered from spherical aberrations coming from the muscle layers above the focal plane and the high sucrose concentration immersion saline, resulting in severe reconstruction artifacts and abnormal calcium dynamics. All our presented results in this section were aberration corrected. For motion artifacts, we used a phase-corrected algorithm to remove motion-induced reconstruction artifacts [9], which is especially important for in vivo imaging.

Since the demonstrated 25 Hz rate is more than sufficient for calcium imaging, we did not push for the highest imaging speed that our camera is capable of. The imaging speed of OS-SIM is theoretically limited by the frame rate of the camera, with a maximum full-chip (2048 × 2048) frame rate of 100 Hz. The frame rate can be increased by simply reducing the number of lines read out. The camera in our system (Hamamatsu, Orca Flash 4.0) can operate at 400 Hz at 512 lines and 800 Hz at 256 lines. By implementing an interleaving OS-SIM reconstruction [53], the OS-SIM frame rate equals the raw image frame rate. Thus, for an imaging area of 256 × 2048 pixels, the 800 Hz frame rate makes voltage imaging possible with OS-SIM.
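The frame-rate arithmetic can be sketched as follows. The inverse scaling of frame rate with the number of lines is an assumption that happens to reproduce the three rates quoted above. The sliding-window reconstruction is one plausible reading of "interleaving", in which every three consecutive raw frames of a cyclic 0°/120°/240° sequence are combined with Eq. (1); it is not necessarily the algorithm of [53].

```python
import numpy as np

FULL_CHIP_LINES = 2048
FULL_CHIP_RATE_HZ = 100.0

def camera_frame_rate_hz(lines):
    """Raw frame rate, assumed inversely proportional to the lines read out."""
    return FULL_CHIP_RATE_HZ * FULL_CHIP_LINES / lines

def os_sim_rate_hz(raw_rate_hz, interleaved=False, n_phases=3):
    """One section per n_phases raw frames, or one per raw frame if interleaved."""
    return raw_rate_hz if interleaved else raw_rate_hz / n_phases

def interleaved_sections(raw_frames):
    """Sliding-window basic OS-SIM over a (time, y, x) stack.

    Any three consecutive frames of a cyclic three-phase sequence contain all
    three phases, so N raw frames yield N - 2 sections.
    """
    f = np.asarray(raw_frames, dtype=float)
    a, b, c = f[:-2], f[1:-1], f[2:]
    return np.sqrt((a - b) ** 2 + (b - c) ** 2 + (c - a) ** 2)

print(camera_frame_rate_hz(256), os_sim_rate_hz(75.0), os_sim_rate_hz(800.0, interleaved=True))
```

With interleaving, consecutive sections share two of their three raw frames, so the output rate matches the raw rate while each section still integrates information over three exposures.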

#### 4 Discussion

Both Bessel focus scanning two-photon fluorescence microscopy and optical-sectioning widefield microscopy can perform high-speed imaging of neural activities at synaptic resolution. OS-SIM can have faster frame rates; however, its application for deep imaging of optically opaque samples is challenging due to tissue scattering. Volumetric imaging with OS-SIM requires physical movement of the objective or sample, and thus may be slower than Bessel focus scanning two-photon fluorescence microscopy, which allows video-rate volumetric recording of neurons at hundreds of microns deep into the highly scattering mouse brain. One should choose between the two methods based on sample and application, for example, whether tissue scattering is a concern and whether activity information over a large volume is required. For thin or transparent samples, OS-SIM enables functional imaging with synaptic resolution at hundreds of hertz; for opaque samples, Bessel focus scanning two-photon fluorescence microscopy would be the method of choice for volumetric activity imaging. Furthermore, both methods can be combined with other cutting-edge techniques and labeling strategies, e.g., adaptive optics [13, 54–56] and near-infrared sensors [57–59], to further enhance the imaging resolution and depth, respectively.

Recent advances in optogenetic actuators and microscopy techniques to activate them have allowed all-optical manipulation of neuronal activity at single-cell resolution (see also Chaps. 3 and 11 of this book). They can be combined with both microscopy methods described here to realize all-optical writing and reading of neuronal activity. The high volumetric imaging throughput of Bessel focus scanning two-photon fluorescence microscopy makes it particularly suited to study the effect of selectively activating a subpopulation of neurons on the activity dynamics of extended 3D networks [38]. If large imaging depths or high volumetric rates are not required, OS-SIM designed to have a large FOV can be combined with optogenetic stimulation to monitor activity over a mesoscopic area [60].

#### References


(vTwINS). Nat Methods 14:420–426. https://doi.org/10.1038/nmeth.4226


an acoustic optical deflector. Biophys J 83:2292–2299. https://doi.org/10.1016/S0006-3495(02)73989-1


Cell Neurosci 8:139. https://doi.org/10.3389/fncel.2014.00139


Opt 15:1–7. https://doi.org/10.1117/1.3324890


Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

## Optical and Analytical Methods to Visualize and Manipulate Cortical Ensembles and Behavior

### Luis Carrillo-Reid, Weijian Yang, and Rafael Yuste

#### Abstract

The development of all-optical techniques and analytical tools to visualize and manipulate the activity of identified neuronal ensembles enables the characterization of causal relations between neuronal activity and behavioral states. In this chapter, we review the implementation of simultaneous two-photon imaging and holographic optogenetics in conjunction with population analytical tools to identify and reactivate neuronal ensembles to control a visually guided behavior.

Key words Two-photon imaging, Two-photon optogenetics, Holographic microscope, Neuronal ensembles, Pattern completion, Population vectors, Control of behavior

#### 1 Introduction

One of the main questions in modern neuroscience is how the activity of identified neuronal populations generates behavioral and mental states [1]. Recent advances in optical techniques that allow the simultaneous recording and manipulation of neurons with single-cell resolution [2, 3] combined with population analytical tools [4–8] suggest that neuronal ensembles are units of brain computation [9, 10]. Neuronal ensembles are groups of neurons with coordinated activity that may underlie sensations, perceptions, emotions, and memories. In order to prove causality between the activity of specific neuronal ensembles and learned behaviors, it is becoming clear that the ability to manipulate and record hundreds of neurons simultaneously needs to be guided by analytical tools that allow the identification of neurons that could have a deterministic impact in brain states. Holographic two-photon imaging [11], two-photon optogenetics [12], and analytical tools [13] offer the possibility to target neuronal ensembles to control behavioral states. The demonstration of the causal relation between neuronal ensemble activity and behavior has been achieved recently in

Eirini Papagiakoumou (ed.), All-Optical Methods to Study Neuronal Function, Neuromethods, vol. 191, https://doi.org/10.1007/978-1-0716-2764-8\_11, © The Author(s) 2023

different brain areas [4, 14–18], paving the way for more sophisticated experiments to understand complex mental states in health and disease [19].

The optimal design of all-optical experiments to control learned behaviors with single-cell resolution requires the simultaneous reading and writing of neuronal activity. This could be achieved in many ways [20], but in this chapter we will focus mainly on scanning and parallel optical techniques guided by analytical tools. We describe the implementation of scanning two-photon imaging and parallel two-photon optogenetics using a spatial light modulator (SLM). We also describe the main concepts necessary to identify and target neurons with pattern completion capability [6, 21] that can recall neuronal ensembles related to a visually guided behavior [4].

#### 2 Implementation of Simultaneous Two-Photon Imaging and Two-Photon Optogenetics

2.1 Background

2.1.1 Light-Sensitive Sensors and Actuators of Neuronal Activity Optical methods are powerful tools to record and manipulate neuronal activity at single-cell resolution across a large population, a key requirement for investigating neuronal ensembles. Compared with electrophysiology, optical methods can simultaneously sample and target a large group of neurons with high spatial specificity in a noninvasive manner and can be conveniently implemented for in vivo studies. The cornerstone of optical methods is the optical microscope, which uses light to record and manipulate the neurons that are illuminated. As the neurons in the brain do not typically respond to light, they can be loaded or transfected with light-sensitive sensors and actuators that can respectively report and manipulate their neuronal activity upon light illumination. Using these neuronal activity sensors and actuators, an optical microscope can simultaneously read and write neuronal activity across a large field of view with high spatial specificity.

Calcium [22–26] and voltage indicators [27, 28] are two commonly used sensors for neuronal activity. These indicators contain fluorophores, which absorb the illumination or excitation light and emit fluorescence. The efficiency of this light-absorption, fluorescence-emission process is modulated by the intracellular calcium concentration in calcium indicators and by the membrane potential in voltage indicators (see also Chaps. 1 and 2). Thus, the neuronal activity can be deduced from the time-lapse recording of the fluorescence in individual cells. The commonly used calcium indicators include GCaMP6 [26], jGCaMP7 [29], jRGECO [30], etc. Compared with calcium indicators, voltage indicators are less mature, with a smaller signal-to-noise ratio and a greater tendency to photobleach, though much progress has been made in the past years [31] (Chap. 2). The counterparts of

these activity sensors are actuators, which could be chemical-based neurotransmitter cage-compounds [32, 33] or opsins [34, 35] serving as light-sensitive ion channels. When absorbing the light, the cage-compounds could release the neurotransmitters (uncaging) and the opsins could open or close the ion channels (optogenetics) and thus control the membrane potential. This mechanism thus allows light control of neuronal activity in individual cells.

2.1.2 One-Photon and Multiphoton Excitation

Depending on how light interacts with the light-sensitive sensors and actuators, optical methods can be classified into two categories: one-photon and multiphoton (i.e., two-photon [36, 37] or three-photon [38, 39]). In the one-photon case, the interaction is linear: the fluorescence emission rate of the sensors and the actuation strength of the actuators are proportional to the intensity of the excitation light before saturation. While one-photon excitation is straightforward, it lacks spatial specificity in 3D. As the focused excitation light forms a double-cone pattern along the axial direction (Fig. 1), all neurons containing sensors or actuators within the double cone may absorb the light. It is thus very challenging to excite just a single neuron in a 3D volume. Furthermore, in terms of imaging, the out-of-focus fluorescence typically becomes background in the image captured at the focal plane and thus reduces the image signal-to-noise ratio. As the light-absorption and fluorescence-emission cycles create phototoxicity, one-photon excitation pays a high price to image the focal plane by creating phototoxicity within the entire double-cone volume. Multiphoton excitation can largely overcome these limitations, as the interaction between the light and the sensors or actuators is nonlinear. In the two-photon case, the excitation effect is proportional to the square of the excitation light intensity before saturation, leading to a strong gradient of the excitation effect along the optical axis in the double cone. Thus, it is feasible to control the incident light so that only the light intensity at the focal point (i.e., the tip of the double cone) is strong enough to excite the neuronal sensors or actuators (Fig. 1). This greatly improves the spatial specificity or resolution and reduces the phototoxicity in out-of-focus regions.
Another advantage of two-photon excitation is that the excitation light has a longer wavelength, which reduces light scattering in brain tissue. Thus, two-photon excitation can penetrate much deeper into the brain than one-photon excitation. The drawback of two-photon excitation is that much more laser power is required because of the lower efficiency of multiphoton absorption. As the excitation light eventually turns into heat, more heat is generated inside the brain. This is typically not a concern, as in a typical imaging experiment the laser power used is below the brain damage threshold [40], even at very deep layers. In the past two decades, two-photon microscopes have become one of the workhorses in neuroscience. Recently, three-photon excitation has been successfully demonstrated for both calcium imaging [39] and optogenetics [41]. Compared with two-photon excitation, three-photon excitation has an even higher nonlinearity and could further increase spatial specificity and penetration depth, though the laser power should be managed so that it does not exceed the brain damage threshold.

Fig. 1 Comparison between one-photon and two-photon excitation. In one-photon excitation (a), a double cone is formed by focusing the visible excitation light (indicated by blue color); fluorophores (green) in the entire double cone could be excited. In two-photon excitation (b), while a double cone is still formed by the focus of infrared excitation light (indicated by red color), the fluorescence generation (green) is localized in the vicinity of the focal region. (Reprinted with permission from Ref. [101], Springer Nature.) Experimental illustration of the one-photon and two-photon excitation (0.16 NA for both cases) is shown in the bottom panel. (Reprinted with permission from Ref. [102], Springer Nature)
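The difference in axial confinement between linear and nonlinear excitation can be illustrated with a short numerical sketch. It assumes an idealized Gaussian beam in a uniformly labeled, non-scattering sample; the beam waist, wavelength, and refractive index are illustrative values rather than parameters of any particular microscope, and the function names are hypothetical. For a Gaussian beam, the excitation integrated over a transverse plane is independent of depth in the one-photon case, whereas in the two-photon case it falls off as the beam area grows away from the focus.

```python
import numpy as np


def rayleigh_range_um(w0_um=0.5, wavelength_um=0.92, n=1.33):
    """Rayleigh range of a Gaussian beam in a medium of refractive index n."""
    return np.pi * w0_um**2 * n / wavelength_um


def plane_integrated_excitation(z_um, order, w0_um=0.5, wavelength_um=0.92,
                                n=1.33, power=1.0):
    """Integral of I(r, z)**order over the transverse plane at depth z.

    order=1 models one-photon excitation, order=2 two-photon excitation.
    All default values are illustrative assumptions.
    """
    z_um = np.asarray(z_um, dtype=float)
    z_r = rayleigh_range_um(w0_um, wavelength_um, n)
    w_sq = w0_um**2 * (1.0 + (z_um / z_r) ** 2)   # squared beam radius at z
    peak = 2.0 * power / (np.pi * w_sq)           # on-axis intensity at z
    # Closed form of the radial integral of (peak * exp(-2 r^2 / w^2))**order
    return peak**order * np.pi * w_sq / (2.0 * order)


if __name__ == "__main__":
    depths = np.array([0.0, 5.0, 20.0])  # micrometers from the focal plane
    one_p = plane_integrated_excitation(depths, order=1)
    two_p = plane_integrated_excitation(depths, order=2)
    print("one-photon, relative to focus:", one_p / one_p[0])
    print("two-photon, relative to focus:", two_p / two_p[0])
```

In this simplified model every out-of-focus plane contributes as much one-photon excitation as the focal plane, while the two-photon contribution is halved one Rayleigh range from the focus and keeps decreasing beyond it.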

2.1.3 Basic Setup of Multiphoton Microscope

Since the multiphoton absorption rate is generally low, and it has a nonlinear relationship with the light intensity, a femtosecond laser is required for multiphoton excitation. The femtosecond laser delivers a periodic pulse train, typically with a repetition rate of 1~80 MHz. The temporal width of a pulse is below 300 fs, yielding a very high instantaneous peak power and thus a high multiphoton absorption rate. Unlike one-photon imaging, where a large area or volume of the sample can be illuminated simultaneously and its fluorescence detected with a camera, multiphoton illumination is typically delivered by rapidly scanning the laser beam across the sample. In multiphoton imaging, a single-pixel detector (versus the pixel array in a camera) is used to record the emitted fluorescence (see also Chap. 2). By correlating the temporal signal from the detector with the scanning trajectory, the image can be built. The microscope setup (Fig. 2a) typically includes a raster lateral (xy) scanning system composed of a galvanometer scanner and a resonant scanner (30~60 Hz frame rate), or two galvanometer scanners (4~10 Hz frame rate). For volumetric imaging, an axial (z) scanning system is implemented by adding a piezoelectric controller on the objective lens or by inserting axial focusing devices, such as an electrically tunable lens [42], a spatial light modulator [43], or a remote focusing unit [44–46] (~ ms focus switching time). Different axial planes are sequentially scanned (Fig. 2b). While this configuration of lateral and axial scanning is straightforward, it may not be the most efficient as it blindly samples the brain tissue. There could be a large volume of "empty" extracellular space without neurons, and among all the neurons, typically only a subset are labeled with the calcium indicators. Random access scanning techniques [47–51] can overcome this issue.
The laser focus spot can jump rapidly between different regions in the sample in 3D (<20 μs transition time between different regions) (Fig. 2c). This is enabled by acoustooptic deflectors (AODs). The challenges of this technique are to overcome the spatial and temporal distortion of the ultrashort laser pulses, caused by the angular dispersion from the AODs' phase grating and the group delay dispersion of the AOD crystals, respectively. These limit the imaging field of view. While these distortions can be compensated [51], the optical setup is complex. AODs also have a limitation of high insertion loss. Recently, many advanced scanning mechanisms including beam multiplexing [43, 52–55] are proposed and demonstrated to increase the imaging throughput. For extensive reviews, see [56–59].

Fig. 2 Basic setup of two-photon laser scanning microscope. (a) Schematics of a typical two-photon laser scanning microscope. The Pockels cell is used to modulate the laser light intensity. The xy-scan system and z-scan system are used to scan the laser focal spot laterally and axially in the sample, respectively. The dichroic mirror transmits the infrared excitation light and reflects the visible fluorescent light to the photomultiplier tube (PMT). (b) Schematics of the typical scanning trajectory in volumetric imaging. (c) Schematics of an exemplary scanning trajectory using random access scanners. The green regions indicate neurons of interest. (Reprinted with permission from Ref. [56], Springer Nature)

The setup for multiphoton photostimulation is similar to that for multiphoton imaging, but without the detection module. The laser spot on the sample is typically controlled by a pair of galvanometer mirrors, and raster or spirally scanned across each target neuron (1~100 ms) before jumping to another neuron. Alternatively, the combination of beam shaping and temporal focusing techniques [60–64] can be implemented so that a disk pattern with tight axial confinement can be projected onto the entire neuron. The scanners then no longer perform raster or spiral scanning, but direct the disk excitation pattern to different neurons sequentially [3]. Using the holographic approach, illumination patterns can be projected onto multiple targeted neurons simultaneously, so the scanners are no longer required [62–66]. We discuss holographic illumination, which can simultaneously photostimulate multiple neurons, in depth in the next section.

In the following, we focus on our implementation of a microscope that can simultaneously perform calcium imaging to record neuronal activity and optogenetics to manipulate neuronal activity.

#### 2.2 Simultaneous Two-Photon Imaging and Two-Photon Optogenetics

2.2.1 Overall Consideration

An all-optical method [20] refers to simultaneously recording and manipulating neuronal activity through light. Here, we describe the combination of two-photon calcium imaging and two-photon optogenetics in a microscope. To flexibly image and manipulate neuronal ensembles, it is desirable to have the following features in the microscope:


To perform calcium imaging and optogenetics simultaneously yet independently, the microscope should have two independent beam paths, one for calcium imaging and one for optogenetics (Fig. 3a) [2, 3, 66–69]. These two beam paths merge before the light is directed into the brain tissue. Each beam path should be equipped with its own femtosecond laser, with a wavelength that matches the center of the excitation spectrum of the calcium indicator or the opsin, respectively. It is important that the indicator and the opsin have different excitation spectra, so as to prevent crosstalk between imaging and photostimulation (see also Chaps. 1, 2, and 5). In other words, when performing calcium imaging, the imaging laser should not excite the opsin; when performing optogenetics, the photostimulation laser should not excite the fluorescence of the calcium indicator. There are two common options for the calcium indicator and opsin pair: green calcium indicators such as GCaMP [26, 29] with red-shifted opsins such as C1V1 [70], or red-shifted calcium indicators such as jRGECO [30] with blue-activated opsins such as ChR2 [34]. While both of these pairs have minimal overlap in their excitation spectra, it is critical to keep the imaging laser power low and to perform control experiments to ensure that the imaging laser does not increase the spiking rate of the neurons (see Note 1). In case the photostimulation laser creates fluorescence artifacts, the imaging data should be processed to eliminate the artifacts. For a successful experiment, it is also important to have a high co-expression rate of the calcium indicators and opsins in a large neuronal population. The incorporation of both constructs into a single virus could be a promising approach [14, 71]. An example of co-expression of GCaMP6s and C1V1 using two different viruses is shown in Fig. 3c.

Fig. 3 Simultaneous two-photon calcium imaging and two-photon holographic optogenetics. (a) Two-photon microscope setup with two independent beam paths for calcium imaging and photostimulation. An electrically tunable lens is equipped in the imaging path so the focal plane can be rapidly switched for high-speed volumetric imaging. The dichroic mirror 3 is used to spectrally separate the fluorescence into green and red channel. In the photostimulation path, a half-wave plate is placed before the spatial light modulator to align the polarization of the laser to the active axis of the spatial light modulator. The spatial light modulator then creates a hologram in the sample to photostimulate the regions of interest (e.g., a group of neurons). A pair of relay lens (4f system) is used to transfer the light field to a set of galvanometer mirrors which can spirally scan the holographic pattern. At the intermediate plane of this pair of relay lens, a zeroth-order beam block is used to block the residue light that is not modulated by the spatial light modulator. The imaging laser and photostimulation laser have different wavelengths, and they are combined through the dichroic mirror 1 before being delivered to the sample through the scan lens, tube lens, and objective lens. HWP half-wave plate, PMT photomultiplier tube. (b) Schematics for simultaneous volumetric calcium imaging and 3D holographic patterned photostimulation in mouse cortex. (c) An exemplary field of view showing neurons co-expressing GCaMP6s (green) and C1V1-mCherry (magenta). (Reprinted and adapted from Ref. [69])

To characterize neuronal ensemble activity, it is necessary to image a large population of neurons, typically on the order of hundreds. For visual cortex, a minimal field of view of 200 × 200 μm<sup>2</sup> is necessary, though 400 × 400 ~ 600 × 600 μm<sup>2</sup> would be more desirable. Using a low-magnification objective lens, the field of view can go above 1 mm<sup>2</sup>. Furthermore, volumetric imaging can be performed to image neuronal ensembles in 3D and across different functional layers (Fig. 3a, b). Due to the sequential scanning nature of two-photon microscopy, there is a tradeoff between the field of view and spatiotemporal resolution. Thus, while it is desirable to increase the field of view so as to image more neurons, one should maintain cellular resolution with a good signal-to-noise ratio (i.e., able to distinguish the calcium transients), as well as a frame rate of at least 4–10 Hz. In the setup shown in Fig. 3a, we use a galvanometer and a resonant scanner pair for lateral scanning, and an electrically tunable lens for fast focal plane switching. Volumetric imaging of 3 planes could be achieved at a volume rate >6 Hz, with a field of view of ~500 × 500 μm<sup>2</sup> per plane [69].

A typical experiment starts with two-photon calcium imaging on the brain tissue. The animal is head-fixed and stabilized on the microscope stage and could perform behavior tasks. The spatial footprint as well as the neuronal activity of individual neurons within the image field of view can then be extracted from the recording. The neuronal ensemble can be identified from the activity pattern (see Subheading 3). Depending on the applications, the users may choose a specific population of neurons to perform optogenetics experiments. It is thus desirable to perform photostimulation on multiple neurons simultaneously. Two-photon holographic illumination enables photostimulating a group of user-selected neurons [2, 12, 14, 66, 68, 69, 72, 73]. We discuss the topic in the following section.

2.2.2 Holographic Illumination

Holographic illumination refers to projecting a computer-generated holographic pattern onto the brain tissue. Neurons falling within this pattern can then be photostimulated. A computer-generated hologram is a light field of an arbitrary 3D shape in the imaging volume, and this shape can be dynamically controlled and changed through a spatial light modulator (Fig. 4a). In the simplest imaging system, which contains a single lens, to generate a hologram in the imaging space (i.e., around the front focal plane of the lens), one can spatially modulate the light field at the back focal plane of the lens. This light field then propagates through the lens, coherently interferes at the front focal plane, and forms the holographic pattern (Fig. 4a). As the light fields at the front focal plane and back focal plane of a lens are related by a Fourier transform, one can calculate how the light field should be spatially modulated at the back focal plane based on the desired pattern at the front focal plane (Fig. 4b, c). The typical spatial light modulator is based on liquid crystals and is configured to only modulate the phase of the light field. The Gerchberg-Saxton algorithm (Fig. 5) [74], which is an iterative approach, can be used to calculate the phase pattern on the spatial light modulator, given the amplitude of the desired pattern in the imaging space and the amplitude of the light field incident on the spatial light modulator. If the holographic pattern only contains a group of points with different weights, a superposition algorithm [55, 75], which is essentially a single iteration of the Gerchberg-Saxton algorithm, can be used to calculate the phase pattern (Table 1).

Fig. 4 Computer-generated hologram. (a) Principle of computer-generated hologram. The collimated laser beam is incident onto the spatial light modulator (SLM), which spatially modulates the wavefront of the light. The light field then propagates through a lens and forms the desired 3D image in the imaging domain through interference. f, focal length of the lens. (b) By modulating the spatial phase profile at the back focal plane of the objective lens, different focal spot patterns can be formed in the imaging domain. (Reprinted with permission from Ref. [103], Elsevier). (c) Example of a two-photon SLM hologram. The four panels illustrate the binary target image, the spatial light modulator phase hologram generated by the Gerchberg-Saxton algorithm, the squared image (to mimic two-photon excitation) of the projected pattern back-calculated from the phase hologram, and the experimentally measured two-photon fluorescence image generated by the SLM. A stylized picture of Cajal is used as a target image. (Reprinted from Ref. [11], Frontiers)
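The iterative phase retrieval described above can be sketched in a few lines. This is a minimal 2D illustration and not the implementation used in the cited work: it assumes that a discrete Fourier transform stands in for propagation through the lens, that the target is defined directly on the FFT grid (no centering shift), and that the illumination on the spatial light modulator is known. The function names and the random initial phase are illustrative choices.

```python
import numpy as np


def gerchberg_saxton(target_amp, source_amp, n_iter=50, seed=0):
    """Return an SLM phase map (radians) for a 2D target amplitude.

    target_amp: desired amplitude in the image (front focal) plane.
    source_amp: amplitude of the beam incident on the SLM.
    The FFT models the lens; this is a simplifying assumption.
    """
    rng = np.random.default_rng(seed)
    phase = rng.uniform(0.0, 2.0 * np.pi, target_amp.shape)
    for _ in range(n_iter):
        # Propagate from the SLM plane to the image plane
        image_field = np.fft.fft2(source_amp * np.exp(1j * phase))
        # Keep the computed phase, impose the target amplitude
        image_field = target_amp * np.exp(1j * np.angle(image_field))
        # Propagate back and keep only the phase (phase-only SLM)
        phase = np.angle(np.fft.ifft2(image_field))
    return phase


def reconstruct_intensity(phase, source_amp):
    """Intensity in the image plane produced by a given SLM phase map."""
    return np.abs(np.fft.fft2(source_amp * np.exp(1j * phase))) ** 2


if __name__ == "__main__":
    target = np.zeros((64, 64))
    for row, col in [(5, 9), (20, 40), (50, 12)]:   # three point targets
        target[row, col] = 1.0
    source = np.ones_like(target)                   # uniform illumination
    slm_phase = gerchberg_saxton(target, source)
    intensity = reconstruct_intensity(slm_phase, source)
    fraction = intensity[target > 0].sum() / intensity.sum()
    print(f"fraction of light in the targets: {fraction:.2f}")
    # Squaring the intensity mimics the two-photon excitation pattern
    two_photon = intensity ** 2
```

Stopping after the first pass of the loop gives a hologram of the superposition type for point targets, with the relative phases between spots set by the random initial guess.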

In two-photon microscopes, the phase cannot be modulated directly at the back focal plane of the objective lens, as the back focal plane is typically inside the objective lens housing. Relay lens pairs (4f system) can create conjugate planes of the back focal plane of the objective lens, and the spatial light modulator can be placed in one of those conjugate planes (Fig. 3a). Furthermore, the residual light that is not modulated by the spatial light modulator (termed the zeroth-order beam) creates a strong focus in the imaging space (see Note 2). The relay lens pairs help to resolve this issue, as a small beam block can be placed at the intermediate plane (which is conjugate to the focal plane in the imaging space) to block the zeroth-order beam (Fig. 3a). Using holographic illumination, various patterns can be generated in the imaging space (Fig. 4b, c).

Fig. 5 Gerchberg-Saxton algorithm. This algorithm is used to calculate the phase hologram φ<sub>s</sub> for the spatial light modulator (SLM) based on the amplitude of the target pattern A<sub>t</sub> in the imaging space and the amplitude of the incident light field A<sub>s</sub> onto the SLM

#### Table 1

#### Analytical expression of the phase hologram and the Zernike polynomials and Zernike coefficients in superposition algorithm

n refractive index of media between the objective and sample, k the wavenumber, z the axial shift of the focus plane in the sample, u, v coordinates on the spatial light modulator phase mask, n sin α the numerical aperture (NA) of the objective

Holographic illumination essentially spatially multiplexes the excitation beams and allows multiple neurons to be photostimulated simultaneously. However, this comes with an increase of laser power on the brain tissue. To alleviate this issue, a laser with a lower repetition rate can be used. When the pulse repetition rate is reduced while the average power is kept the same, the energy in each laser pulse is increased. As the two-photon excitation efficiency is proportional to the square of the laser peak power, the increase of excitation efficiency due to the increase in pulse energy outweighs its reduction due to the reduced number of pulses per unit time. Thus, using a laser with a lower repetition rate, a higher overall excitation efficiency can be achieved at the same average power. In other words, to achieve the same excitation efficiency, the average laser power can be reduced. In the holographic photostimulation system described, we used a laser with a repetition rate of 1 MHz [69]. This reduces the overall laser power by 80 times compared with a commonly used 80 MHz femtosecond laser.
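The repetition-rate argument can be made concrete with a small calculation. The sketch below assumes the commonly used simplified scaling in which the time-averaged two-photon signal is proportional to the square of the average power divided by the product of repetition rate and pulse width; it ignores saturation, opsin kinetics, and any difference in pulse shape. The function names are hypothetical, and the numbers are examples only.

```python
import math


def pulse_energy_nj(avg_power_mw, rep_rate_mhz):
    """Energy per pulse in nJ (1 mW at 1 MHz corresponds to 1 nJ)."""
    return avg_power_mw / rep_rate_mhz


def relative_two_photon_signal(avg_power_mw, rep_rate_mhz, pulse_width_fs):
    """Time-averaged two-photon signal in arbitrary units.

    Assumed scaling: signal ~ P_avg**2 / (rep_rate * pulse_width).
    """
    return avg_power_mw ** 2 / (rep_rate_mhz * pulse_width_fs)


def power_for_same_signal(avg_power_mw, rep_rate_mhz, new_rep_rate_mhz):
    """Average power at the new repetition rate that gives the same
    time-averaged signal, for an unchanged pulse width."""
    return avg_power_mw * math.sqrt(new_rep_rate_mhz / rep_rate_mhz)


if __name__ == "__main__":
    # Same average power, same pulse width, two repetition rates
    gain = (relative_two_photon_signal(10.0, 1.0, 250.0)
            / relative_two_photon_signal(10.0, 80.0, 250.0))
    print("signal gain at 1 MHz vs 80 MHz, equal average power:", gain)
    print("power at 1 MHz matching 80 mW at 80 MHz (mW):",
          power_for_same_signal(80.0, 80.0, 1.0))
    print("pulse energy, 80 mW at 80 MHz (nJ):", pulse_energy_nj(80.0, 80.0))
    print("pulse energy, 1 mW at 1 MHz (nJ):", pulse_energy_nj(1.0, 1.0))
```

Under this scaling, the power saving depends on what is held constant. Matching the energy per pulse between an 80 MHz and a 1 MHz laser corresponds to an 80-fold lower average power, with 80 times fewer pulses per second, whereas matching the time-averaged signal corresponds to a reduction by the square root of 80.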

Due to the chromatic dispersion and spatial discretization of the SLM pixels, the SLM has a spatially varying diffraction efficiency [75]. The diffraction efficiency drops as the deflection angle increases, limiting the addressable 3D field of view of SLM. To alleviate this effect, the diffraction efficiency can be first measured, and then compensated in the Gerchberg-Saxton algorithm or through the weight factor Ai (Table 1) in the superposition algorithm [55, 75]. The laser intensity across the field of view can then be made uniform. By using an XY galvanometer set to provide a lateral offset to the centroid of the SLM's addressable field of view, the effective lateral field of view can be further extended. While neurons across this enlarged field of view cannot be targeted simultaneously, groups of neurons located at different sub-fields can be targeted by switching the offset of the XY galvanometer and the phase pattern on the SLM [76].

2.2.3 Spiral Scan Versus Scanless Approach in Holographic Photostimulation

As mentioned in Subheading 2.1, photostimulation can be performed by raster or spirally scanning the laser focal spot across the neuron, or by projecting a disk pattern that matches the morphology of the neuron so that the entire neuron can be stimulated at once. The same applies in holographic photostimulation. In the first approach, a group of focal spots is created in the hologram, and each focal spot lies on the centroid of an individual targeted neuron [2, 12, 69]. A set of galvanometer scanners then spirally scans the entire group of focal spots (with a certain number of repetitions), so that each focal spot spirally scans across the corresponding targeted neuron. In the second approach, a group of disk patterns is created in the hologram and projected onto the brain tissue [72, 77, 78]. Each disk spatially overlaps with a targeted neuron. Since the hologram already generates disks for all the target neurons, this method can work without scanners and is thus a scanless approach. In two-photon excitation, these two approaches have their own advantages and limitations. The spiral scan approach spatially concentrates the laser intensity into focal spots; as the two-photon excitation effect is proportional to the square of the light intensity, this approach has a high excitation efficiency. As the entire neuron is not photostimulated at the same time, reaching the threshold for evoking action potentials relies on the accumulation of the excitation effect along the spiral trajectory. The decay constant of the opsin kinetics (tau-off) should thus be longer than the duration of each spiral (i.e., the opsin channels should stay open during a single spiral scan, which can be less than 1 ms); otherwise the excitation effect cannot be accumulated.
The scanless approach, on the other hand, disperses the light intensity across the entire neuron, and thus it requires a higher average power to reach the same excitation strength as the spiral scan approach. However, as the entire neuron is stimulated at once, it does not pose a limitation on the kinetics of the opsin. It is also expected that the jitter of the delay between the onset of photostimulation and the onset of the action potential is smaller. As the opsin decay constant (tau-off) is typically longer than the duration of a single spiral scan (which can be <1 ms), the scanning approach is favored, as it takes lower laser power to photoactivate the neurons. In our experiments, we used C1V1 as the opsin. Using the spiral scan, it takes about 1.8 times less power to photoactivate the neurons than with the scanless approach, for the same photostimulation duration (Fig. 6) [69]. For opsins with faster kinetics, a faster spiral scan or the scanless approach may be preferred.
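A spiral trajectory of the kind described above can be generated as a pair of galvanometer command waveforms. The sketch below assumes an Archimedean spiral that starts at the cell centroid and grows linearly to the edge of the cell body; the radius, number of rotations, duration, and sample rate are illustrative values, and the function name is hypothetical. Because the galvanometers move the whole hologram, the same waveform would apply to every focal spot in the pattern.

```python
import numpy as np


def spiral_scan(radius_um, n_rotations, duration_ms, sample_rate_khz=100.0):
    """Return (x, y) offsets in micrometers for one spiral scan.

    Assumes an Archimedean spiral (radius grows linearly with time)
    starting at the centroid; this shape is an illustrative choice.
    """
    n_samples = int(round(duration_ms * sample_rate_khz))
    t = np.linspace(0.0, 1.0, n_samples)       # normalized time
    theta = 2.0 * np.pi * n_rotations * t      # accumulated angle
    r = radius_um * t                          # linearly growing radius
    return r * np.cos(theta), r * np.sin(theta)


if __name__ == "__main__":
    # Example: 50 rotations over a 5 um radius in 20 ms
    x, y = spiral_scan(radius_um=5.0, n_rotations=50, duration_ms=20.0)
    print("samples:", x.size)
    print("time per rotation (ms):", 20.0 / 50)
```

Changing `duration_ms` while keeping the number of rotations fixed changes the scanning speed, which is one way to obtain different stimulation durations with the same spatial coverage.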

2.2.4 Detailed Implementation of the Microscope for Simultaneous Two-Photon Imaging and Two-Photon Holographic Photostimulation

In this section, we explain in detail how the microscope can be constructed with two beam paths for two-photon imaging and two-photon holographic photostimulation (Fig. 3a). We aim to help readers understand the inner workings of a commercial microscope system and to provide basic guidelines for those who want to build the microscope themselves. Here, we use GCaMP6 as the calcium indicator and C1V1-mCherry as the red-shifted opsin. The mCherry can be used to indicate whether the opsin is expressed in each neuron. We combine holographic illumination with the spiral scan approach to minimize the laser power delivered to the brain.

#### Setup of the Two-Photon Imaging Path

(1.1) The imaging laser is typically a wavelength-tunable Ti:Sapphire laser with a repetition rate of 80 MHz and a pulse width of 70~140 fs. For GCaMP imaging, the wavelength can be set to 920~940 nm. A Pockels cell is set up after the laser to modulate the laser power. An optional pulse compressor could be set up after the Pockels cell to optimize the pulse width at the sample.

Fig. 6 Comparison between spiral scan and scanless holographic approaches for photostimulation. In the scanning approach, the laser spot is spirally scanned over the cell body; in the scanless approach, a disk pattern is generated by the SLM, covering the entire cell body at once. (a) Photostimulation-triggered calcium response of a targeted neuron in vivo in layer 2/3 of mouse V1, for different stimulation modalities. For each modality, the product of the stimulation duration and the square of the laser power was kept constant over four different stimulation durations. The average response traces are plotted over those from the individual trials. (b) ΔF/F response of individual neurons under different photostimulation conditions (layer 2/3 of V1, over a depth of 100~270 μm from the pial surface; one-way ANOVA test). For each neuron and each stimulation duration, the laser power used in the scanless disk modality is 1 and 1.8 times that used in the spiral scan. For each neuron and each modality, the product of the stimulation duration and the square of the laser power was kept constant over four different stimulation durations. (c) Boxplot summarizing the statistics in (b). The central mark indicates the median, and the bottom and top edges of the box indicate the 25th and 75th percentiles, respectively. The whiskers extend to the most extreme data points (99.3% coverage if the data are normally distributed) not considered outliers, and the outliers are plotted individually using the "+" symbol. In this experiment, the mice were transfected with GCaMP6f and C1V1-mCherry. The repetition rate of the photostimulation laser is 1 MHz. The spiral scan consists of 50 rotations to cover the neuronal cell body, and the scanning speed is adjusted to produce different stimulation durations. (Reprinted with permission from Ref. [69])

#### Setup of the Two-Photon Photostimulation Path

(2.1) The photostimulation laser can be a high repetition rate fiber laser or a low repetition rate pulse-amplified laser (including an optical parametric amplified laser). The latter is preferred for holographic photostimulation, as its higher pulse energy allows lowering the average photostimulation power per cell, thus enabling photostimulation of a large number of neurons within the overall available power budget. For a red-shifted opsin such as C1V1, the preferred laser wavelength is 1040~1080 nm. As in the imaging path, a Pockels cell and an optional pulse compressor are set up after the laser.


#### Control Electronics and Software


#### Coordinate Calibration

(4.1) It is critical to register the photostimulation beam's target coordinate with the imaging laser's image coordinate. An autofluorescent plastic slide can serve as the sample, and a 2D holographic pattern can be projected to burn spots on the surface of the autofluorescent plastic slide. The imaging laser can then visualize the burned spots. This can also calibrate the axial focus offset between the two beam paths. An affine transformation can be extracted to map the coordinates between the hologram generation algorithm and the actual imaging system. This can be repeated for a few defocusing depths set in the spatial light modulator, and a linear interpolation of the mapping can be applied for the depths in between.
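The registration step in (4.1) amounts to fitting an affine map from matched point pairs. The sketch below uses an ordinary least-squares fit with NumPy; the choice of mapping image coordinates to hologram coordinates, the function names, and the example numbers are assumptions made for illustration. Interpolating the fitted matrices linearly between two calibrated depths is one simple way to apply the depth interpolation mentioned in the text.

```python
import numpy as np


def fit_affine(src_xy, dst_xy):
    """Least-squares affine map taking src points to dst points.

    Returns a (3, 2) matrix M such that [x, y, 1] @ M approximates dst.
    At least three non-collinear point pairs are needed.
    """
    src_xy = np.asarray(src_xy, dtype=float)
    dst_xy = np.asarray(dst_xy, dtype=float)
    design = np.hstack([src_xy, np.ones((len(src_xy), 1))])
    matrix, *_ = np.linalg.lstsq(design, dst_xy, rcond=None)
    return matrix


def apply_affine(matrix, xy):
    """Apply a (3, 2) affine matrix to an array of (x, y) points."""
    xy = np.asarray(xy, dtype=float)
    return np.hstack([xy, np.ones((len(xy), 1))]) @ matrix


def interpolate_affine(matrix_a, z_a, matrix_b, z_b, z):
    """Linearly interpolate two calibrations measured at depths z_a, z_b."""
    weight = (z - z_a) / (z_b - z_a)
    return (1.0 - weight) * matrix_a + weight * matrix_b


if __name__ == "__main__":
    # Hypothetical burned-spot positions seen by the imaging laser (pixels)
    image_xy = np.array([[100, 100], [400, 120], [380, 410], [90, 390]])
    # Hypothetical coordinates requested from the hologram algorithm (um)
    hologram_xy = np.array([[-50, -50], [50, -45], [45, 52], [-52, 47]])
    image_to_hologram = fit_affine(image_xy, hologram_xy)
    print(apply_affine(image_to_hologram, [[250, 250]]))
```

With the map fitted in this direction, the centroid of a neuron picked in the calcium image can be converted directly into a target coordinate for the hologram.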

(4.2) Due to the chromatic dispersion and finite pixel size of SLM, the SLM's beam steering efficiency, also called diffraction efficiency, drops with larger angle, leading to a lower beam power for targets further away from the center field of view (in xy), and nominal focus (in z). This efficiency drop can be analytically calculated [55, 75] or experimentally measured by raster scanning the photostimulation beam on the autofluorescence slide and detecting the fluorescence strength. A linear compensation can be applied in the weighting coefficient among different points in the target pattern to counteract this non-uniformity (see Note 3).

#### Optogenetics Experiment


#### 3 Identification and Targeting of Neuronal Ensembles Related to Behavior

#### 3.1 Background

The implementation of simultaneous two-photon imaging [11] and two-photon optogenetics [12] has allowed the recording and manipulation of tens of cells simultaneously [14, 66, 69]. However, the identification of the neurons that could have a robust impact on the overall population activity also represents a critical step in the design of experiments aiming to control learned behaviors [4, 21, 81].

3.1.1 Multidimensional Reduction Techniques Applied to Population Recordings

It has been demonstrated that the activity of neuronal ensembles can be defined as an array of multidimensional population vectors, where the dimensionality of the array is given by the number of recorded cells in the field of view [5, 8, 82]. Similar population vectors can be visualized by diverse computational techniques that define clusters in a reduced dimensional space, where each cluster depicts a neuronal ensemble [5, 83]. A neuronal ensemble represents a group of neurons with coordinated activity that repeats at different points in time [10]. The characterization of brain states using multidimensional population vectors is independent of the recording length and can be implemented in chronic experiments where the activity of identified neuronal ensembles is compared across different days.

3.1.2 Targeting Visualized Neuronal Populations with Two-Photon Optogenetics

Two experimental designs are possible to control behavior targeting visualized neuronal populations (see Note 5). One solution is to target all available neurons that could respond to a given behavioral cue with single-cell precision [14, 16, 18]. The other solution is to target neurons with pattern completion capabilities that could recall physiologically relevant neuronal ensembles related to behavior [4, 6, 10, 19, 21].

In cortical circuits, the synchronous activation of several neurons could simultaneously result in two unwanted scenarios: (i) the generation of epileptiform activity or (ii) the forced engagement of GABAergic circuits. In both scenarios, the physiological recalling of a specialized neuronal ensemble is compromised (see Note 5).

On the other hand, the identification and targeting of neurons with pattern completion capabilities could keep the balance between excitation and inhibition inherent to the microcircuit under study, allowing the recalling of physiological neuronal ensembles that work as attractors [84] evoking a given behavior [84, 85].

#### 3.2 Implementation of Analytical Methods to Recall Neuronal Ensembles Relevant to a Learned Behavior

In this part of the chapter, we describe the procedures and concepts to analyze population activity extracted from calcium imaging recordings. We will focus on the steps after individual neurons have been identified from imaging recordings and their activity has been inferred from calcium transients.

3.2.1 Motion Correction, Identification of Neurons, and Spike Extraction

Simultaneous population recordings allow the characterization of population activity with single-cell resolution but generate large datasets representing an analytical challenge. A common practice used for analysis of simultaneous population recordings is to create a binary representation of neuronal activity where 1's indicate firing and 0's indicate silent periods [82, 86]. Recently, several independent research groups have released open-source code to preprocess calcium imaging recordings to extract activity information from a series of images [87–90].

Fig. 7 Spike inference from holographic calcium imaging recordings. (a) Representative experiment where two focal planes were recorded simultaneously using holographic two-photon microscopy. (b) Regions of interest (ROIs) detected from the recordings in a. Each ROI corresponds to an identified neuron. (c) Representative changes in fluorescence obtained from ROIs shown in b. (d) Binary arrays representing the activity of one neuron obtained from inferred spikes. (Modified from Ref. [5, 10])

There are four main steps that need to be considered before performing population analysis on binary arrays (Fig. 7):


Fig. 8 Neuronal ensembles represented as multidimensional population vectors. (a) Schematic representation of different frames where active neurons are shown in black-filled circles (left). Binary matrix illustrating the overall population activity, where each row is a neuron and each column denotes a population vector. The total number of recorded neurons gives the dimensionality of the array. (b) Neuronal ensembles representing recurrent groups of neurons that are active at different times can be understood as clusters of similar population vectors in a multidimensional space. The cosine similarity could be used as a metric to calculate the angle between population vectors. If the angle is close to 0°, the population vectors are similar, therefore almost the same neurons fired at different times. If the angle is close to 90°, the population vectors are different. (Reprinted from Ref. [10])

Each row in the binary matrix depicts the activity of one neuron and each column in the binary matrix represents the activity profile of a neuronal population (Fig. 8a). To visualize the overall network activity, the binary matrix can be plotted as a raster plot where 1's are dots. The population synchronicity can be extracted from the time histogram of the raster plot [5, 8].
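The binary matrix and its time histogram can be illustrated with a minimal sketch. The function name `coactivity_histogram` and the toy values are assumptions made for illustration and are not taken from the cited studies.

```python
import numpy as np

def coactivity_histogram(raster):
    """Count the active neurons in each frame (column sums of the binary matrix)."""
    raster = np.asarray(raster)
    return raster.sum(axis=0)

# Toy binary matrix: 4 neurons (rows) x 6 frames (columns); values are made up.
raster = np.array([
    [1, 0, 0, 1, 0, 1],
    [1, 0, 0, 1, 0, 0],
    [0, 1, 0, 0, 0, 1],
    [0, 1, 0, 1, 0, 0],
])

# Time histogram of the raster: one count per frame (population vector).
coactivity = coactivity_histogram(raster)

# Coordinates of the dots that would be drawn in a raster plot.
neuron_idx, frame_idx = np.nonzero(raster)
```

Peaks in `coactivity` mark frames in which many neurons fired together, which is one simple way to read population synchronicity from the raster.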

3.2.3 Multidimensional Population Vectors Defining Neuronal Ensembles To identify groups of neurons with coordinated activity, population vectors from a physiologically meaningful time window are constructed; time windows from 100–500 ms have been shown to reflect physiological functions and generate similar results. Population vectors capture the coordinated activation of specific groups of neurons. Once the population vectors are defined, comparing the distribution of coordinated activity that could appear randomly against the observed coordinated activity can show population vectors above chance levels [5, 7]. Only population vectors with more active cells than the ones expected by chance should be considered for further analysis. Significant population vectors capture network activity defining a multidimensional space in which the number of dimensions is dictated by the total number of identified neurons with coordinated activity above chance levels [4–6, 8, 10]. It is important to highlight that to characterize population activity, each dot in a multidimensional space should be a population vector instead of an individual neuron (Fig. 8b).
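One way to estimate the chance level of coordinated activity is a shuffling test, sketched below. The text does not specify the shuffling procedure, so the circular shift of each neuron's activity, the 95th percentile cutoff, and the names `coactivity_threshold` and `significant_vectors` are all assumptions of this sketch.

```python
import numpy as np

def coactivity_threshold(raster, n_shuffles=1000, percentile=95.0, seed=0):
    """Chance-level number of coactive neurons per frame.

    Assumption: each neuron's row is circularly shifted by a random offset,
    which keeps its activity rate but breaks coordination across neurons.
    """
    rng = np.random.default_rng(seed)
    raster = np.asarray(raster)
    n_neurons, n_frames = raster.shape
    null_counts = np.empty((n_shuffles, n_frames))
    for i in range(n_shuffles):
        shifts = rng.integers(0, n_frames, size=n_neurons)
        shuffled = np.stack([np.roll(row, s) for row, s in zip(raster, shifts)])
        null_counts[i] = shuffled.sum(axis=0)
    # Percentile over all shuffled frames pooled together.
    return np.percentile(null_counts, percentile)

def significant_vectors(raster, threshold):
    """Keep only population vectors with more active cells than the threshold."""
    raster = np.asarray(raster)
    keep = raster.sum(axis=0) > threshold
    return raster[:, keep], np.flatnonzero(keep)
```

The retained columns are the significant population vectors that enter the similarity analysis; the indices returned alongside them record when each vector occurred.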

3.2.4 Similarity Measurements on Population Vectors It has been shown that significant population vectors can be used to discriminate similar patterns of activity repeated at different times [4–6, 8]. The representation of network activity as population vectors allows an exhaustive comparison of all population vectors to visualize different experimental conditions. Similarity maps of population vectors represent a valuable tool to visualize recurrent activity of neuronal ensembles. To construct similarity maps, all possible combinations of vector pairs need to be computed. Since population vectors representing the activity of a given group of neurons point in a similar direction in a multidimensional space, the angle between population vectors could be used to create a similarity map [8, 82]. The cosine of the angle between a pair of population vectors is defined by their normalized dot product: $\cos(\theta) = \dfrac{V_1 \cdot V_2}{\lVert V_1 \rVert \, \lVert V_2 \rVert}$. Thus, if the angle is close to zero, the vectors are similar, whereas if the vectors are different, they tend to be orthogonal (Fig. 8b).
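The similarity map built from the normalized dot product of all vector pairs can be sketched as follows. The function name `cosine_similarity_map` and the handling of empty vectors are assumptions of this sketch.

```python
import numpy as np

def cosine_similarity_map(vectors):
    """Cosine of the angle between every pair of population vectors.

    vectors: array of shape (n_neurons, T), one population vector per column.
    Returns a (T, T) similarity map.
    """
    X = np.asarray(vectors, dtype=float)
    norms = np.linalg.norm(X, axis=0)
    # Assumption: empty vectors get similarity 0 instead of a division by zero.
    norms[norms == 0] = 1.0
    Xn = X / norms
    return Xn.T @ Xn

# Toy example: the first two vectors share the same active neurons.
vectors = np.array([
    [1, 1, 0],
    [1, 1, 0],
    [0, 0, 1],
])
similarity = cosine_similarity_map(vectors)
angles_deg = np.degrees(np.arccos(np.clip(similarity, -1.0, 1.0)))
```

For binary vectors the similarity lies between 0 (no shared active neurons, orthogonal vectors) and 1 (identical sets of active neurons).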

3.2.5 Identification of Neuronal Ensembles from Population Vectors

Similarity maps constructed from all possible vector pairs could be understood as a low-dimensional representation of the original network activity, where the angles between all population vectors as a function of time are highlighted. Thus, high similarity values in the same row denote recurrent groups of neurons firing together. A cluster of population vectors pointing in a similar direction gives the definition of a neuronal ensemble using the population vector approach. Similarity maps can be transformed to a binary matrix S of size [T × T], where T is the number of population vectors (Fig. 9a, b). Singular value decomposition (SVD) factorizes S as the product of three factors: V, Σ, and V<sup>T</sup>, where V and V<sup>T</sup> are orthonormal bases and Σ contains the singular values [5, 8]. To detect neuronal ensembles from recurrent patterns of activity observed in the matrix S, the rate of decay of the singular values is determined (Fig. 9c). Taking the singular values above chance levels can reproduce the original matrix S with high accuracy (Fig. 9d). Each factor of the SVD then represents an orthogonal component that defines a neuronal ensemble (Fig. 9e). In the case of primary visual cortex, each neuronal ensemble represents a different orientation of drifting-gratings, a different natural scene, or the Go signal from a visually guided Go/No-Go task [4, 5, 7]. Another approach to identify neuronal ensembles is the projection of the multidimensional population vectors into a low-dimensional space. In this reduced space, clusters of population vectors found with cluster analysis define neuronal ensembles or network states (Fig. 8b) [82]. The advantage of using SVD factorization is that the total number of neuronal ensembles could be systematically detected from the magnitude of the singular values. Thus, recurrent population vectors can be assigned to a given neuronal ensemble and population vectors with low repetition rate are excluded.
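The factorization step can be sketched with NumPy under stated assumptions. The similarity threshold used to binarize the map and the singular-value cutoff (for example, one derived from shuffled data) are supplied by the user here; the names `svd_ensembles`, `sim_threshold`, and `sv_cutoff` are hypothetical and do not come from the cited studies.

```python
import numpy as np

def svd_ensembles(similarity, sim_threshold, sv_cutoff):
    """Factorize a binarized similarity map and keep factors above a cutoff.

    similarity: (T, T) similarity map of population vectors.
    sim_threshold: similarity value above which an entry of S is set to 1.
    sv_cutoff: singular values above this value are treated as ensembles.
    """
    S = (np.asarray(similarity) > sim_threshold).astype(float)
    U, singular_values, Vt = np.linalg.svd(S)
    n_ensembles = int(np.sum(singular_values > sv_cutoff))
    # Each retained factor is a rank-1 matrix highlighting one recurrent pattern.
    factors = [
        singular_values[k] * np.outer(U[:, k], Vt[k])
        for k in range(n_ensembles)
    ]
    return S, singular_values, factors
```

Summing the retained factors gives a low-rank approximation of S. Population vectors that appear in none of the retained factors are the ones with low repetition rate.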

Fig. 9 Neuronal ensembles defined by similarity maps of population vectors using singular value decomposition (SVD). (a) Similarity map illustrating the angles between all possible pair combinations of population vectors. Note that the recurrent activation of a given group of neurons is visualized as increased similarity index in the same row. An angle between two vectors close to 0° represents a similarity value close to 1, whereas an angle between two vectors close to 90° represents a similarity value close to 0. (b) Binary matrix computed from the similarity map in a, representing significant patterns of activity factorized by SVD. Black patterns in the same row indicate recurrent coactive neurons at different times. (c) Magnitude of singular values used to define the number of neuronal ensembles repeated above chance levels. Red line indicates the double of singular values from shuffled data. The cutoff indicates the number of neuronal ensembles that account for >90% of the data. In general, microcircuits of ~100 neurons stimulated with four drifting-gratings can be defined by ~6 neuronal ensembles. (d) Binary matrix of the first 6 factors obtained from SVD of b. (e) Factors from SVD that reproduce the overall response of the imaged focal plane to visual stimuli. Bars on top indicate orientations of drifting-gratings. Empty squares represent spontaneously active neuronal ensembles that appeared in the absence of visual stimuli. (Modified from Ref. [5])

3.2.6 Pattern Completion Properties of Neuronal Ensembles

In the neuronal ensemble framework, pattern completion refers to the ability to recall a whole neuronal ensemble by the activation of a few neurons with strong functional connectivity [6]. It has been recently shown that stimulating the same neuronal population several times (~100) was able to imprint an artificial neuronal ensemble in layer 2/3 of primary visual cortex of awake head-fixed mice [6]. Imprinted ensembles were composed of neurons that had low probability to fire together, but after the imprinting protocol, the stimulated neurons fired spontaneously in the absence of stimuli (Fig. 10a). The mechanism governing the creation of artificial neuronal ensembles could be explained by Hebbian synaptic plasticity [94], where the connectivity between neurons firing together was strengthened (Fig. 10b). Indeed, neurons responding to a given drifting-grating or belonging to a neuronal ensemble have increased probability to be connected [8, 95]. Imprinted ensembles could also be recalled by the stimulation of individual neurons with pattern completion capabilities (Fig. 10), suggesting that pattern completion could be a general property of different brain areas [21].

Fig. 10 Imprinting and recalling of artificial neuronal ensembles in layer 2/3 of primary visual cortex. (a) Spatial map of neurons activated with two-photon optogenetics several times (~100). Red neurons represent repeatedly responding neurons (left). Scale bar: 50 μm. Imprinted ensemble (red neurons) shows spontaneous activity in the absence of any stimuli (middle). The activation of one neuron with pattern completion capability (blue arrow) was able to recall the imprinted ensemble (red neurons). (b) Cartoon illustrating how imprinted and recalled ensembles could be explained by the strengthening of connections of pre-existing ensembles and photostimulated neurons. (Modified from Ref. [10])

3.2.7 Recalling Neuronal Ensembles Related to Behavior To prove the causal relation between neuronal activity and a learned behavior, it is necessary to identify representative members of neuronal ensembles that could trigger the behavior [81]. During the behavioral training phase, photostimulation should be avoided (see Note 6). It has been recently shown, in a visually guided Go/No-Go task, that neuronal ensembles responding to drifting-gratings in layer 2/3 of primary visual cortex increase their reliability for the Go signal while reducing their reliability for the No-Go signal [4]. Increased reliability is reflected as strengthened functional connectivity of the Go ensemble, allowing neurons with pattern completion capabilities [21] to recall the whole Go ensemble and trigger the perception of the Go signal, evoking the learned behavior (Fig. 11).

#### 4 Considerations for the Implementation of a Visually Guided Go/No-Go Task


Fig. 11 Controlling behavior using pattern completion properties of Go ensembles. (a) In mice trained in a visually guided Go/No-Go task, the Go signal activates a neuronal ensemble related to the correct execution of the task (green neurons). The activation of the Go ensemble produced licking. (b) Activation of as few as two pattern completion neurons (red neurons) belonging to the Go ensemble is able to recall the whole Go ensemble and produce licking even in the absence of visual stimuli or behavioral cues. (Reprinted from Ref. [4])


#### 5 Notes


#### 6 Outlook

A general development direction of all-optical methods is to increase the throughput (i.e., field of view and spatiotemporal resolution) of both imaging and photostimulation, while reducing the laser power incident onto the brain. Beam multiplexing techniques are promising approaches to increase the imaging throughput [56–59]. Larger spatial light modulators with higher pixel count could increase the beam diffraction efficiency and thus the field of view [14]. Faster spatial light modulators could allow rapid switching of the hologram and thus could enable faster photostimulation of neuronal ensembles in specific spatiotemporal patterns. As the simultaneously photostimulated neurons across a 3D volume increase [14, 66, 69, 72], there could be off-target effects through photostimulating the dendritic arbors crossing the cell bodies of target neurons. Somatic restricted viruses where the opsin only expresses in the cell body could alleviate this issue [14, 66, 72, 96].

To access deeper brain regions noninvasively, adaptive optics [97, 98] and three-photon excitation [38, 39, 41, 99, 100] could be used. A spatial light modulator in the photostimulation path could implement adaptive optics, and a spatial light modulator in the imaging path could allow for similar corrections. Three-photon calcium imaging has also been successfully demonstrated [39, 41, 99, 100]. One current challenge is that three-photon optogenetics in deep brain regions requires high laser power. The development of high-efficiency opsins optimized for three-photon excitation could overcome this challenge.

The development of all-optical techniques to simultaneously read and write patterned activity in neuronal populations could help the understanding of perception, memory formation, behavioral states, and pathological conditions with single-cell resolution in the next decades. The systematic documentation of the causality between neuronal ensemble activity and brain states requires further development of analytical tools to visualize and target specific neurons that can control the overall network activity [81]. The adaptation of artificial intelligence algorithms used in big data to understand population activity could be used to design and guide in vivo experiments [13].

Scaling of these techniques to several brain areas to measure and control thousands of neurons in different parts of the brain will require the identification and targeting of neurons with pattern completion capability [21, 81]. This approach could control thousands of neurons by the activation of a small percentage of them, reducing sample heating and deterioration of spatial resolution that comes with increased number of parallel stimulations.

Finally, the development of transgenic mice with genetically encoded calcium or voltage indicators and opsins that use different wavelengths in molecularly identified subpopulations of neurons [20] will allow chronic recordings and stimulation of several brain areas with single cell and molecular identity precision, which represent the next challenge for neuronal ensemble research.

#### Acknowledgments

CONACYT (INFR-2018-294756 [LCR], CF6653 [LCR]), Burroughs Wellcome Fund (Career Award at the Scientific Interface #1015761) [WY], National Eye Institute (R01EY011787 [RY]), National Institute of Mental Health (R01MH115900) [RY], National Institute of Neurological Disorders and Stroke (R01NS110422 [RY], R34NS116740 [RY], R01NS118289 [WY]), the US Army Research Office (W911NF-12-1-0594 MURI) [RY], and the National Science Foundation (CAREER 1847141 [WY], CRCNS 1822550 [RY]). Rafael Yuste is listed as an inventor of the patent: "Devices, apparatus and method for providing photostimulation and imaging of structures" (United States Patent 9846313). Weijian Yang and Rafael Yuste are listed as inventors of the patent: "System, method and computer-accessible medium for multi-plane imaging of neural circuits" (United States Patent 10520712).

#### References


term synaptic plasticity in cortical networks. Int J Neural Syst 25:1550026


motifs supporting short-term memory. Nat Neurosci 24:259–265


Boyden ES, So PTC (2017) Wide-field three-photon excitation in biological samples. Light Sci Appl 6:e16255


two-photon microscopy. Proc Natl Acad Sci U S A 110:13138–13143


sub-millisecond precision. In: Optics in the Life Sciences. The Optical Society of America, BrM3B.4


Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

## Chapter 12

## Illuminating Neural Computation Using Precision Optogenetics-Controlled Synthetic Perception

## Jonathan V. Gill, Gilad M. Lerman, Edmund Chong, Dmitry Rinberg, and Shy Shoham

#### Abstract

Connecting neuronal activity to perception requires tools that can probe neural codes at cellular and circuit levels, paired with sensitive behavioral measures. In this chapter, we present an overview of current methods for connecting neural codes to perception using precision optogenetics and psychophysical measurements of synthetically induced percepts. We also highlight new methodologies for validating precise control of optical and behavioral manipulations. Finally, we provide a perspective on upcoming developments that are poised to advance the field.

Key words Optogenetics, Holography, Olfactory, Psychophysics, Two-photon excitation

#### 1 Introduction

A fundamental question in systems neuroscience is how sensory stimuli are encoded by the activity of neurons. This question has been the subject of intense scrutiny across sensory systems, leading to fundamental discoveries describing the nature of neural representations across many levels of information processing and abstraction [1, 2]. The rapid pace of this progress has been matched with an improvement in tools used for recording neuronal activity at a large scale (e.g., two-photon (2P) calcium imaging), leading to the now-routine generation of datasets comprised of hundreds to thousands of simultaneously recorded neurons in awake behaving animals [3, 4]. A consistent observation made from recordings taken across sensory systems is that neural responses to sensory stimuli include changes in both the rate and timing of action potentials within populations of neurons, as well as modulation with respect to ongoing, internally generated rhythms (sniffing, whisking, locomotion, etc.) [5–10]. Despite the ability to measure complex sensory responses, it remains unclear what role individual features of neural activity (e.g. rate, timing, etc.) play in perceptually guided behavior.

One way to pose this question is to ask what features of sensory-evoked activity are read by downstream brain areas to guide behavior. To answer this question, one must establish a causal link between features of neural activity (e.g., the firing rate and timing of specific neurons) and behavioral report (e.g., detecting a sensory stimulus). Perturbations of neural activity are therefore a key tool for connecting specific circuits and features of spiking activity to perception. Pioneering this conceptual framework, an influential study by Newsome and colleagues demonstrated that the perceptual effects of stimulation could be quantitatively estimated, employing experiments in which a monkey's motion-perception-related response was biased by electrical stimulation of the middle temporal visual area (MT) [11]. In these experiments, the monkey could be biased toward reporting a certain direction of motion in an ambiguous stimulus if stimulation was applied to a columnar area of cells which were maximally responsive, or 'tuned', to the corresponding motion direction. One reason why the effect of this stimulation was both predictable and reliable was that the spatial scale of the manipulation (hundreds of micrometers current spread) was on the order of the neural representation (a cortical column).

The advent of optogenetics has permitted increasingly precise interventions across a broad range of neural circuits and behaviors, culminating in the development of two-photon (2P) optogenetics [12–16]. When combined with 2P imaging and holographic targeting, this technique permits the activation of arbitrary sets of pre-selected neurons with up to sub-millisecond precision in vivo [17–22]. Neural representations encoded at fine spatiotemporal scales can now be interrogated using "precision optogenetics," increasing the potential for an even more nuanced understanding of the connection between neural codes and behavior [23–29]. A detailed description of optical methods for combining 2P photostimulation and imaging appears in other chapters of this book. In what follows, we specifically focus on methods developed for combining precise photostimulation with psychophysical measurements of evoked synthetic percepts and on complementary methods used to confirm a precise relationship between optical manipulations and the observed behavior. These methods underlie recent scientific studies on the coding logic of olfactory perception, whose results are described in detail in our recent publications [25, 30, 31]. While the particular features of interest may be determined by the sensory system under study, this chapter highlights general frameworks for connecting synthetically induced activity with perceptual and behavioral readout.

#### 2 Paradigms for Psychophysical Measurement of Synthetic Perception

The rapid development of optogenetics within the past decade has introduced an extensive toolbox of genetically encoded, light-activated ion channels capable of temporally precise, bidirectional modulation of activity within predefined circuits and cell types [16, 18, 24, 32]. In order to link specific circuits and cell types with their role in generating a sensory guided behavior, a common approach has been to manipulate neural activity while an animal is presented with a stimulus and test the consequence on behavior compared to the normal condition. In this way an experimenter can enhance, eliminate, or "tweak" neural activity to discover how circuits and cell types may be involved in making a particular perceptual judgement.

This approach has undoubtedly enhanced our understanding of which brain circuits and cell types are essential for certain perceptual tasks. For example, inhibiting the visual cortex impairs orientation discrimination [33], and broad, unstructured optogenetic activation of olfactory inputs interferes with concurrent odor recognition [34]. But these manipulations are relatively uninformative about precisely how sensory percepts are formed by those circuits and cell types, that is, how they are encoded by and decoded from their activity.

To address this question, we must manipulate sensory representations at the spatiotemporal scale with which they are encoded. Previous technologies making use of electrical or one-photon manipulations lacked the ability to perturb sets of individual, pre-selected cells, limiting their potential to directly connect specific neurons and spiking patterns with sensory-guided behavior. However, spatiotemporal excitation technologies from the emerging field of precision optogenetics are capable of addressing representations that are encoded at either the microcircuit or distributed scales. In what follows, we will describe the application of two key techniques.

The first, 2P holographic optogenetics, is currently the only method capable of selectively stimulating individual neurons which can be targeted based on their response properties, cell type, location, and projection targets, along with many other selection criteria [35]. The second, one-photon (1P) patterned optogenetics, is capable of stimulating groups of superficial neurons spread out over many millimeters in arbitrary patterns, which can precisely manipulate discrete representations organized over larger spatial scales, such as olfactory bulb glomeruli [30, 36]. Critically, precision optogenetics has the ability to bypass the sensory organ or peripheral circuits involved in sensory-guided behaviors and directly address the perceptual impact of specifically targeted neurons. In this way, an experimenter can create "synthetic" stimuli, whose cellular composition and timing are exactly known. This allows the experimenter to take a step beyond validating the participation of individual circuits for behavior and to determine precisely what features of the activity of neural circuits guide perception. This is accomplished by holding individual features constant (e.g., what neurons are stimulated), while parametrically varying other features (e.g., timing), and performing sensitive psychophysical measurements of the perceptual impact.

2.1 Using Detection to Test the Relevance of Neural Codes The first experimental strategy is to assess the influence of specific features within neural activity patterns on the detectability of an artificial stimulus. Animal survival relies on the detection of brief, faint cues to signal the presence of food, mates, and predators. A multitude of studies across species have revealed the exquisite sensitivity of sensory systems to their preferred stimuli [37, 38]. Exploring how critical information is conveyed by sensory circuits at the perceptual limit provides a lens to examine the essential coding features connecting neural activity to behavior. However, even in this minimal regime, sensory-evoked responses can be complex, simultaneously encoding stimuli in the identity, rate, and timing of specific neurons. By replacing external stimuli with targeted activation of neurons in sensory areas, we can directly test which features of the observed activity affect the detectability of the artificial stimulus. We can then infer the features of activity essential for perception.

Early studies using either electrical or 1P optogenetic stimulation revealed that rodents can detect changes in the spike rate of single neurons [39, 40], or across populations of hundreds of neurons [41] in the somatosensory and visual cortices, using stimuli that lasted for hundreds of milliseconds. Additional studies have explored the role of relative spike timing [41, 42], and sensitivity to latency [43, 44] in populations of hundreds to thousands of optogenetically activated cortical neurons. These studies were successful in demonstrating that rodents can perceive minimal perturbations in sensory cortical neurons, and perceptibility may, or may not, vary along certain feature dimensions, like timing. However, the techniques used in these experiments lacked the ability to target specific sets of neurons according to their functional properties or tuning. By optogenetically labeling olfactory sensory neurons expressing a specific receptor (M72-ChR2), a key related study in the olfactory system revealed that mice could detect brief activation (10 ms) of a single glomerulus [45]. This well-defined input channel to the olfactory bulb is a site of convergence where ~5,000 olfactory sensory neurons provide input to ~25–30 mitral and tufted cells (projection neurons), demonstrating that the elementary features of olfactory perception operate at, or below, this spatiotemporal scale.

The recent maturation of 2P holographic optogenetics represents an important opportunity for probing the relevance of neural coding features for detection. This approach has recently been applied across a range of sensory systems (olfactory, visual, somatosensory), where 2P photostimulation of a predefined set of neurons was used as the stimulus to be detected. Because specific neurons can be targeted using this technique, a larger feature space can be explored. For example, a recent study observed that the functional connectivity and orientation tuning of a subset of photostimulated neurons were related to their detectability [23] (see also Chap. 11). Extended photostimulation durations (1 s) of a subset of visual cortical neurons "recalled" a larger ensemble response tuned to a particular visual stimulus, and as few as two of these "pattern completion" neurons were detectable. That is, strongly photostimulating a specific small group of neurons (as few as two neurons) evoked the activation of a larger, behaviorally detectable pattern. Another recent study found that mice could detect 250–500 ms photostimulation of ~14 somatosensory (barrel) cortical neurons and that this ability improved with experience but did not depend on the precise neurons targeted [26].

Both of these studies found evidence that mice can detect the photostimulation of a few neurons delivered over a relatively long timescale (250–1000 ms), potentially evoking hundreds of spikes. However, for a range of behaviors, animals have been shown to make decisions about salient perceptual cues within very short temporal windows [46–48]. For example, rodents are capable of detecting odorants in less than a single sniff (<100 ms), at extremely low concentrations (as low as 10<sup>−12</sup> M) [34, 37, 49, 50]. In this case, the very first volley of action potentials generated by inhaling an odor is sufficient to inform an animal's response.

To understand what features of olfactory bulb activity guide rapid detection of faint stimuli, Gill, Lerman et al., 2020 extended the use of 2P holographic optogenetics to probe the detectability of single spikes distributed across small ensembles of olfactory bulb neurons (Fig. 1) [25, 51]. In this study, head-fixed, water-restricted mice were trained to detect the synchronous activation of a group of olfactory bulb neurons composed of mitral cells (excitatory), or a mixture of mitral cells and granule cells (inhibitory). The animals' respiration was monitored so that photostimulation delivery could be timed to a fixed delay after inhalation (20–40 ms), mimicking the sampling of an odor (Fig. 1a, b). A specific, predefined set of neurons was used as the stimulus across several sessions, allowing the experimenters to assess the effects of learning or plasticity on detectability (Fig. 1c). Mice were able to detect the synchronous activation of ~30 neurons at high performance within several sessions.

After mice reached a consistent level of performance, the stimulus was parametrically varied along three feature dimensions: number of neurons, relative timing (synchrony), and latency from inhalation (Fig. 2). In this way, the contribution of each of these features to detectability could be independently assessed. We found that mice were capable of detecting single spikes distributed across <20 neurons, with an average psychophysical threshold of 10–15 neurons. Synchrony was tested by staggering the timing of spikes when the full pattern of neurons was targeted, revealing a strong dependence on the relative timing of photostimulation across neurons. Finally, we specifically manipulated the latency from inhalation and found that detection did not depend on sniff phase.

Fig. 1 Testing detection of precision photostimulation. (a) Schematic of photostimulation detection experiment. A head-fixed mouse with a chronically implanted window above the olfactory bulb is positioned in front of a lickspout and pressure sensor to monitor respiration (sniff). (b) Trial structure for detection experiment. A tone signaled the start of trials and photostimulation was timed to a fixed delay relative to sniff. (c) Left, neurons in the mitral cell layer (MCs and GCs) co-expressing ChrimsonR-tdTomato (red) and GCaMP6s (green). Thirty neurons were targeted for simultaneous photostimulation (white circles). Scale bar: 40 μm. Right, outcomes for responses to the "go" and "no-go" trials. Red circles indicate stimulation of a particular cell, while empty circles indicate no stimulation. (Adapted from Ref. [25])

Fig. 2 Independent control of multiple activity features. (a) Schematic of stimuli which vary in the number of neurons targeted, but not the timing of activation. (b) Schematic of stimuli which vary in the synchrony across neurons, but not the number of neurons. Stimuli were presented with a mean latency of 45 ms across conditions. (c) Schematic of stimuli which vary in the latency of photostimulation relative to the onset of inhalation, but not synchrony, or number of neurons. (Adapted from Ref. [25])

These results demonstrated that the exquisite sensitivity of the olfactory system goes far beyond a single sniff or glomerulus, with mice capable of detecting just a few spikes, so long as they occurred within less than 30 ms of each other. The independent control of coding features using precision optogenetics was critical for discovering a previously unknown contribution of relative spike timing to olfactory perception. Future studies will surely extend this approach to test an even broader range of features across sensory systems using even more sensitive behavioral tasks to ultimately reveal the building blocks of perception.

#### 2.2 Technical Implementation of Detection Experiments

Behavioral Task and Training The previously described experiments all made use of a similar strategy for assessing the detectability of 2P photostimulation. They used a go/no-go behavioral paradigm, in which an animal makes a binary decision about the presence or absence of a stimulus. In this paradigm, an animal is trained to respond in some manner (e.g., licking, lever press, nose poke) for a reward, or withhold a response (e.g., do not lick, press, or poke), to signal whether a "go" stimulus cue is present or absent ("no-go"). Typically, the "go" stimulus is randomly, or pseudorandomly presented on a fraction of trials (typically 0.5), where on the remaining fraction of trials an animal experiences either a "distractor" stimulus or nothing. Animal behavior is progressively shaped to associate the availability of reward with the presence or absence of the stimulus. Once the task is acquired, the experimenter can then change the parameters of the stimulus in either a trial or blockwise manner to determine the sensitivity of animals to each parameter based on the frequency of correct choices or error type.

In the previously described experiments, 2P photostimulation of a predefined set of neurons was used as the "go" stimulus. In all cases the behavioral experiments were conducted in head-fixed, water-restricted mice outfitted with a cranial window. Detection experiments begin by limiting the amount of water available to an animal to ~1 mL/day, depending on their body weight. After 3–5 days of water restriction, mice will consume their daily allotment, at which point they can begin training to receive water while head fixed. Habituation to head fixation and "lick training" can occur simultaneously, by gently head-fixing naïve mice, and making available a metal spout (lick tube) that will deliver a small droplet of water when contacted by the tongue. Mice can freely lick the tube to receive 2 μL droplets, until they have learned to lick enough times to receive the full 1 mL of water for the day. This has the advantage of forming an association between head fixation and the availability of reward which can decrease overt distress during behavioral training and imaging sessions. A useful tool for detecting licking is a capacitive touch sensor coupled to hypodermic tubing which can be used to trigger the release of water through a pinch valve or solenoid controlled by a microcontroller or computer.

After animals reliably lick for water, it is necessary to shape their behavior to acclimate them to the timing and conventions of the go/no-go task. This can be done by initially training mice to recognize very salient stimuli and thus learn the associations of the task before moving on to more difficult conditions. As pre-training for a 2P photostimulation detection task, this typically takes the form of training animals to detect a high intensity target stimulus for the sensory modality under study (e.g., a high concentration odorant, a high contrast oriented grating, a large amplitude whisker deflection), with another clearly separable stimulus used as the distractor, or "no-go" stimulus, or nothing at all. Alternatively, 1-photon light of the appropriate wavelength for the opsin expressed in the neurons of interest can be used to train animals to recognize artificial photostimulation during this training stage. This has the advantage of more closely mimicking the 2P photostimulation detection task, since both involve artificial activation of neurons in a given area, but has the disadvantage of not clearly mapping onto a specific perceptual stimulus, which makes it difficult to study generalization from real to artificial stimuli. Either way, performing a shaping procedure is essential so that learning trajectories during 2P photostimulation can reflect each animal learning to detect activity in a small ensemble of neurons, and not merely learning the basic contingencies of a go/no-go task.

Perceptual Testing Once the animal's behavior has been shaped, detection of the 2P photostimulation can be tested. If neurons are targeted across days, care must be taken to align each session's field of view with a common template (described in Subheading 3.2). Responses of targeted neurons should be measured outside of the behavioral task for each session in order to determine whether changes in detectability are related to learning, or a change in the ability to photostimulate the targeted neurons.

After detection ability for a set of neurons has been established, the experimenter can vary features of the evoked activity to test their contribution to detection. For manipulations of neuron number, the stimulus is replaced with a hologram targeting all, or a subset of the neurons from the set. It is important to maintain the same power per neuron across conditions. For manipulations of timing, it is important to first measure the average latency and jitter of spikes evoked by 2P photostimulation (described in more detail in Subheading 3.1). To control the timing and power of 2P photostimulation, a Pockels cell can be used for rapid and precise control of light delivery. Alternatively, a shutter can be used; however, it is essential to confirm that the animal cannot hear the shutter if detection is being tested. For experiments testing relative timing of neuronal activation, or synchrony, it is important to use a spatial light modulator (SLM) capable of rapidly switching between holograms targeting subsets of the neurons (ideally <10 ms switching time).
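The timing manipulation described above can be sketched as a simple schedule that splits the targeted neurons into subsets and staggers the hologram for each subset around a common mean latency. This is only an illustrative sketch: the function name, the round-robin grouping, and the default values (a 45 ms mean latency taken from Fig. 2 and a 10 ms SLM switching time taken from the text) are assumptions, and the sketch ignores the duration of each photostimulation pulse.

```python
def staggered_schedule(target_ids, n_groups, spread_ms,
                       mean_latency_ms=45.0, slm_switch_ms=10.0):
    """Return a list of (onset_ms, targets) pairs, one per hologram.

    Hypothetical helper: onsets are spaced evenly over `spread_ms`
    and centered on `mean_latency_ms` (measured from inhalation onset).
    """
    target_ids = list(target_ids)
    if n_groups < 1 or n_groups > len(target_ids):
        raise ValueError("n_groups must be between 1 and the number of targets")
    # Deal targets into subsets round-robin (an arbitrary choice for this sketch)
    groups = [target_ids[i::n_groups] for i in range(n_groups)]
    if n_groups == 1:
        # Fully synchronous condition: one hologram containing every target
        return [(mean_latency_ms, groups[0])]
    step = spread_ms / (n_groups - 1)
    if step < slm_switch_ms:
        raise ValueError("interval between holograms is shorter than the SLM switching time")
    start = mean_latency_ms - spread_ms / 2.0
    return [(start + k * step, groups[k]) for k in range(n_groups)]


# Example: 30 targets delivered as 3 holograms spread over 40 ms
for onset, targets in staggered_schedule(range(30), n_groups=3, spread_ms=40.0):
    print(onset, len(targets))
```

Keeping the subsets the same size holds the mean latency across neurons constant while only the spread changes, which mirrors the design in which synchrony is varied without changing neuron number.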

If using a go/no-go paradigm, manipulations can be performed in a trial-by-trial or blockwise manner. While changing the condition randomly (e.g., number of targeted neurons) on each "go" trial may seem like an unbiased way to test the feature space, it is often best to test a single condition per block of trials. The reason for this is that performance errors can take two flavors, false alarms and misses. Mice tend to make the majority of their incorrect choices as false alarms, ensuring that they do not miss a potential "go" trial or opportunity to obtain a reward. By testing one condition per block of trials, usually composed of 50% go and 50% no-go (no stimulus), false alarms are readily interpretable, as the false alarm rate may increase as the stimulus condition becomes more difficult to detect (e.g., fewer neurons targeted), even if the number of "hits" does not decrease. If different conditions are randomly interleaved in a trial-by-trial manner, one relies only on miss rate to determine the differences in detection performance, since all conditions share the same false alarm rate, significantly reducing the sensitivity of this measure.
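To make the role of false alarms concrete, the following minimal sketch computes the hit rate, false alarm rate, and the signal-detection sensitivity index d' for a single block. The use of d' and of a log-linear correction are assumptions of this example (they are standard in psychophysics but are not stated in the text as the measures used in the studies described), and the function name is hypothetical.

```python
from statistics import NormalDist


def block_performance(n_hits, n_misses, n_false_alarms, n_correct_rejections):
    """Return (hit_rate, false_alarm_rate, d_prime) for one block of trials."""
    n_go = n_hits + n_misses
    n_nogo = n_false_alarms + n_correct_rejections
    if n_go == 0 or n_nogo == 0:
        raise ValueError("a block needs both go and no-go trials")
    hit_rate = n_hits / n_go
    fa_rate = n_false_alarms / n_nogo
    # Log-linear correction (assumption): keeps z-scores finite when a rate is 0 or 1
    hit_adj = (n_hits + 0.5) / (n_go + 1)
    fa_adj = (n_false_alarms + 0.5) / (n_nogo + 1)
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    return hit_rate, fa_rate, z(hit_adj) - z(fa_adj)


# Two blocks with identical hit rates but different false alarm rates
easy = block_performance(45, 5, 10, 40)
hard = block_performance(45, 5, 20, 30)
print(easy[2] > hard[2])
```

In this example the two blocks cannot be distinguished by their hits or misses alone; only the block-specific false alarm rate separates them, which is the information lost when conditions are interleaved trial by trial.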

#### 2.3 Measuring Perceptual Distance of Synthetic Percepts

The second experimental strategy developed recently to connect neural activity features to perception is to measure the perceptual distance of synthetic percepts evoked by optogenetic stimulation. Experiencing a familiar object, for example, a rose, evokes a complex spatiotemporal pattern of activity in sensory areas. What features of this activity determine the identity of the object as "rose" and not another object like "tulip" or "orange"? Are some features generally better at explaining the differences in how stimuli are perceived, regardless of the specific stimuli being compared? By determining these features, we can expose the computational strategies underlying perceptual identity.

Traditionally, it has been difficult to determine the perceptual relevance of different coding features using natural (non-optogenetic) stimuli. One reason for this difficulty is that different features of neural activity often covary with one another as stimulus identity is changed. For example, presentation of a rose, tulip, and orange may each evoke changes in both the firing rate and timing of overlapping sets of neurons. Which of these features (cell identity, rate, timing, etc.) is essential for the animal to discriminate "flower" from "fruit," or to identify "rose" specifically? While expanding the stimulus set to include more diverse examples may help tease out responses unique to each class or exemplar, biophysical constraints often impose correlations between features, and it can be difficult or impossible to design stimulus sets that fully disentangle their contribution.

The use of natural stimuli to determine perceptually relevant coding features is further hampered by generating inferences purely through correlation. While some features of activity may appear to predict perceptual choices made by the animal, they may not have any actual influence on the behavior. For example, the spike rate of neurons in a particular brain region may be highly predictive of whether an animal will classify a stimulus as "flower" or "fruit," but it need not be the case that any of these neurons is actually relevant for the discrimination. If the experimenter were to manipulate the spike rate of the neurons (activating or silencing them), they might find no effect on the choices of the animal, despite the strong correlation of the spike rate to the behavior in the normal condition. In this way, merely observing activity and relating features to behavior has limited power for testing causal models of perception.

Precision optogenetics therefore provides an opportunity to independently manipulate activity features and determine their differential impact on perception. Further, this technique provides the opportunity to test whether behavior is guided by combinations, or conjunctions of features (e.g., rate and tuning, or sequential order and phase). By creating fully synthetic stimuli using precision optogenetic stimulation, one can manipulate activity features while an animal performs a recognition, or discrimination task, in which they signal the perceived identity of a stimulus through their behavioral choice. By measuring the frequency with which animals categorize induced activity patterns to be the same, or different percepts, one can determine the relative perceptual distance between synthetic stimuli. Finally, this approach allows the experimenter to determine what individual or combinations of features define the axes of perceptual identity for a particular neural circuit.

This strategy has recently been used in our work to probe the perceptual relevance of different coding features in the mouse olfactory system. Chong, Moroni, et al. (2020) used a combination of genetic labeling (OMP-ChR2, expressing ChR2 only in olfactory sensory neurons [52]) with precise optical targeting using a high-resolution digital micromirror device to project light patterns onto the surface of the olfactory bulb, evoking activity patterns with single glomerulus resolution (Fig. 3a) [30]. In this way, we created synthetic odor stimuli by activating sets of glomeruli with high spatial and temporal precision. We trained water-deprived mice to recognize a single spatiotemporal pattern as a target "odor," and to report activation of this pattern by licking a waterspout. The mice were trained to discriminate this stimulus from every other non-target pattern (any spatiotemporal pattern not identical to the target), and to report any non-target pattern by licking a different waterspout (Fig. 3b).

Fig. 3 Spatial and temporal perturbations of synthetic odors. (a) Schematic of the experimental setup. Dorsal olfactory bulb was exposed by a chronically implanted 3 mm window. Spatiotemporal stimulation patterns, created by a digital micromirror device, were projected onto the olfactory bulb of a head-fixed OMP-ChR2 mouse in front of a pressure sensor for sniff monitoring, and lick spouts delivering water. (b) Schematics for pattern discrimination task. Animals were trained to recognize Target versus Non-target patterns defined on a stimulation grid. Target patterns comprised six spots, initialized randomly but fixed across subsequent sessions, activated in an ordered sequence defined in time where 0 marks inhalation onset. Non-target patterns were six off-Target spots, randomly chosen from trial to trial, with randomized timing within 300 ms from inhalation (~single sniff). (c) Illustration of spatial perturbations: One or multiple spots in Target patterns were randomly replaced with Non-target spots. (d) Illustration of temporal perturbations in which one or multiple spots in Target patterns were temporally shifted. (Adapted from Ref. [30])

After mice learned to perform this task with high accuracy, we tested how recognition changed as we systematically manipulated the activity patterns across several feature dimensions (Fig. 3c, d). Mice experienced a small proportion of "probe" trials (10% of total trials), in which the target stimulus was modified, either by replacing spots with non-target spots (Fig. 3c) or by shifting spots in time (Fig. 3d). We measured the fraction of trials in which mice reported the modified pattern as being like the "target" pattern. This proportion reflected the perceptual distance between the modified and original target pattern.
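The probe-trial measure described above can be tabulated with a few lines of code. The sketch below assumes a hypothetical trial log of `(condition, reported_target)` pairs; the condition labels and the second helper, which expresses each probe as a drop relative to the unperturbed target, are illustrative assumptions rather than the analysis used in Ref. [30].

```python
from collections import defaultdict


def target_report_fractions(trials):
    """Fraction of trials reported as 'target' for each condition.

    `trials` is an iterable of (condition, reported_target) pairs,
    where reported_target is True when the animal chose the target spout.
    """
    counts = defaultdict(lambda: [0, 0])  # condition -> [target reports, total trials]
    for condition, reported_target in trials:
        counts[condition][0] += int(bool(reported_target))
        counts[condition][1] += 1
    return {cond: k / n for cond, (k, n) in counts.items()}


def drop_from_target(fractions, baseline="target"):
    """Assumed distance proxy: reduction in target reports relative to baseline."""
    base = fractions[baseline]
    return {cond: base - f for cond, f in fractions.items() if cond != baseline}


# Hypothetical log: unperturbed target trials plus two probe conditions
log = ([("target", True)] * 9 + [("target", False)]
       + [("replace_1_spot", True)] * 3 + [("replace_1_spot", False)]
       + [("replace_3_spots", True)] + [("replace_3_spots", False)] * 3)
fractions = target_report_fractions(log)
print(drop_from_target(fractions))
```

A probe that is still reported as the target on most trials yields a small drop (perceptually close to the target), whereas a probe rarely reported as the target yields a large one.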

To determine the relative influence of the spatial and temporal features on perception, the experiments of Chong et al. 2020 used precise parameterization of perturbations to the target stimulus. One key finding was a primacy effect in which perturbations to earlier activated glomeruli had larger effects on perceptual responses than later activated glomeruli. Despite the fact that animals could use any available activity features to solve the target vs. non-target categorization, such as glomerular identity or timing, this result suggests that these features do not carry equal weight for informing an olfactory percept, as all the tested mice were more strongly influenced by changes to the earliest spots in the pattern without being explicitly trained to use this feature. While both spatial and temporal perturbations were effective at increasing the perceptual distance from the target pattern, the joint assessment of these features demonstrated that odor identity representation is nuanced and determined along several dimensions simultaneously. This study highlights how a fully synthetic approach can be used to establish basic principles of the neural code informing perception. Models derived from these findings could then be tested and further refined by the use of naturalistic stimuli, ultimately closing the loop between causal manipulation and ethological observation.

#### 2.4 Technical Implementation of Perceptual Distance Experiments

In the previously described experiments, mice performed a 2-alternative forced choice task in which licking either the left lick spout or the right lick spout signaled a choice of either Target or Non-target pattern perceived by the animal. The lick spout assigned to Target or Non-target should be randomly determined for each experimental animal to control for any systematic side bias. Trials consist of a stimulus period, a grace period in which mice can lick without reward or punishment, and a response period in which the first lick determines the animal's choice. The purpose of a grace period is to reduce the influence of impulsive licking on the trial outcome and is typically ~0.5 s following the stimulus period. This period provides time for an animal who may have been licking in response to the trial initiation to change to an informed licking pattern after experiencing the stimulus. Initially, mice will often be biased toward licking one side over the other [53], so it is important during training to perform a de-biasing procedure which adaptively increases the incidence of trials on the biased-against side [54]. By doing so, side biases can be eliminated before initiating critical phases of the experiment.
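One simple way to implement an adaptive de-biasing rule is sketched below. It is an assumption of this example, not the procedure of Ref. [54]: the probability that the next trial requires a left response is set to the fraction of recent choices made to the right, clipped so that neither side disappears entirely. The function names, the choice window, and the clipping limits are all hypothetical.

```python
import random


def left_trial_probability(recent_choices, floor=0.1, ceiling=0.9):
    """Probability that the next trial's correct side is left ('L').

    `recent_choices` is a list of 'L'/'R' choices from the last few trials.
    The more the animal has licked right, the more left trials it receives.
    """
    if not recent_choices:
        return 0.5  # no history yet: present both sides equally
    frac_left = recent_choices.count("L") / len(recent_choices)
    return min(ceiling, max(floor, 1.0 - frac_left))


def next_trial_side(recent_choices, rng=random):
    """Draw the correct side for the next trial using the rule above."""
    return "L" if rng.random() < left_trial_probability(recent_choices) else "R"


# An animal that chose right on 8 of its last 10 trials
history = ["R"] * 8 + ["L"] * 2
print(left_trial_probability(history), next_trial_side(history))
```

Because the rule depends only on a sliding window of choices, the trial proportions drift back toward 50/50 as the side bias disappears.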

The first lick during the response period is counted as the animal's choice and the trial ends, leading to an inter-trial-interval before the next trial. Reward will be provided to the animal at different rates depending on the phase of the experiment. Initially, during the shaping period, animals are trained to discriminate between one Target and one Non-target pattern until animals exceed a threshold of performance (80%). During this phase, animals are rewarded for correct choices 100% of the time. After the initial shaping, Non-target patterns are randomly initialized on each trial, and due to the combinatorics, they never repeat. The animals are also shaped toward a 70% reinforcement rate. At this point, the animals experience test sessions in which probe trials make up 10% of the total trials while Target and Non-target trials comprise 45% each. These probe trials involve perturbations to the target pattern along feature dimensions under study (spot identity, relative timing, shift with respect to respiration, etc.), and are randomly interleaved with the Target and Non-target trial types. Critically, probe trials are never rewarded, allowing the animal's choices to reflect their internal Target vs Non-target category boundaries.
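The composition of a test session described above can be sketched as follows. The proportions (10% probe, 45% Target, 45% Non-target), the 70% reinforcement rate, and the rule that probe trials are never rewarded come from the text; the function name, the dictionary layout, and the interpretation of the reinforcement rate as the probability that a correct choice is rewarded are assumptions of this sketch.

```python
import random


def build_test_session(n_trials, p_probe=0.10, p_reward=0.70, seed=None):
    """Return a shuffled list of trial descriptions for one test session."""
    rng = random.Random(seed)
    n_probe = round(n_trials * p_probe)
    n_target = (n_trials - n_probe) // 2
    n_nontarget = n_trials - n_probe - n_target
    kinds = ["probe"] * n_probe + ["target"] * n_target + ["nontarget"] * n_nontarget
    rng.shuffle(kinds)  # probes are randomly interleaved with the other trial types
    session = []
    for kind in kinds:
        # Probe trials are never rewarded; other trials reward a correct
        # choice with probability p_reward (assumed meaning of "70% reinforcement")
        rewardable = kind != "probe" and rng.random() < p_reward
        session.append({"kind": kind, "reward_if_correct": rewardable})
    return session


session = build_test_session(200, seed=0)
print(sum(trial["kind"] == "probe" for trial in session))
```

Partial reinforcement on Target and Non-target trials is what makes unrewarded probe trials hard for the animal to distinguish from ordinary trials on the basis of outcome alone.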

#### 3 Validating Precise Manipulation

#### 3.1 Characterizing the Scale and Timing of Response to Stimulation

In order to interpret the results of an experiment combining precision optogenetics with behavior, it is critical that the characteristics and specificity of the neural response to the manipulation be known. For example, if one wants to study how the relative order of neurons activated by 2P photostimulation impacts the perceptual quality evoked by the manipulation, it is critical that the timing of cellular responses to stimulation be known, lest the neurons respond out of order. Similarly, if an experiment relies on comparing how two different groups of neurons are perceived when they are separately activated, it is critical to determine the specificity with which the groups can be targeted, as inadvertent activation of neurons in the wrong group could confound the inferences made from an animal's behavior. We will cover three useful methods for assessing the scale and timing of responses to all-optical targeting and manipulation in vivo which may aid in the design and interpretation of behavioral experiments.

Targeted Electrophysiology Prior to initiating a study utilizing all-optical methods such as 2P photostimulation for combination with behavior, it is strongly advisable to assess the responses of neurons to a range of photostimulation parameters (power, duration, frequency, etc.). In this way, the all-optical system can be properly "tuned" to provide an expected response for a given opsin, cell-type, tissue depth, etc. (see Note 3). Efficacy and timing of evoked responses to stimulation using "standard" approaches, such as holographic patch, or spiral scanning of a focused beam have been characterized in the literature for only a very limited range of conditions, and rarely in vivo (though a number of recent studies have helped to fill this knowledge gap [18–20, 25, 28]). Currently, single-unit electrophysiology is the standard for measuring photocurrents and evoked spiking with high temporal precision. Combining high-impedance electrophysiological recordings performed with a glass pipette and 2-photon imaging, it is possible to target and record from specific predefined neurons (Fig. 4a).

Fig. 4 Targeted electrophysiology. (a) 2P-guided cell-attached electrophysiological recording in an awake C57BL/6 mouse mitral cell layer co-expressing ChrimsonR-tdTomato (red) and calcium indicator GCaMP6s (green). The neuron (white circle) was targeted by a light patch (scale bar, 20 μm). (b) Examples of electrophysiological recordings during 5, 10, and 20 ms photostimulation. (c) Example raster plot for 10 repetitions of 10 ms photostimulation with 30 mW average power. The response latency is defined as the time from photostimulation onset to the first spike, and jitter is defined as the standard deviation of the latency across photostimulation repetitions. (Adapted from Ref. [25])

To perform targeted recordings, one must begin with an appropriate amplifier, digitizer, and program for recording and aligning electrophysiological data to photostimulation delivery. As the main interface to the tissue, borosilicate glass pipettes pulled to 5–11 MΩ (~1–1.5 μm tip size) are appropriate when coupled to an electrode holder with an outlet for pressure regulation. For juxtacellular recordings (cell abutting, but not internal), electrodes can be filled with a modified current-clamp "external" solution (130 mM K-gluconate) containing fluorescent dye for visualization. In practice, neurons are often labeled with both a calcium indicator (GCaMP6, jRGECO1a, etc.) and a fluorophore indicating the expression of an opsin (ChrimsonR-tdTomato, Chronos-EYFP, etc., though see Note 1), which makes visualizing the thin tip of a glass pipette difficult, since both green and red imaging channels contain information. One solution is to use a mixture of green and red dyes (Alexa 488/594 mixture, Millipore) when preparing the pipette solution, which allows the pipette to be viewed simultaneously on both the green and red imaging channels, so that it stands out to some degree from the surrounding tissue.

It is optimal to perform recordings in conditions as similar as possible to behavioral conditions, so it is recommended to perform recordings in awake, head-fixed animals. To achieve this, it is important to have a stable, chronically implanted preparation that permits repeated electrophysiological access to the targeted area, as well as optical clarity for 2P imaging. Both of these requirements can be met by implanting a glass cranial window (Warner) with pre-drilled holes filled with silicone elastomer (Quik-Sil). If appropriately mixed, the elastomer will remain transparent and create an air-tight seal in the cranial window, allowing it to be implanted chronically. Prior to the recording session, the silicone plug can be removed with a pair of fine forceps, exposing the tissue underneath. The brain should be bathed in sterile saline or artificial cerebrospinal fluid for the duration of the recordings, after which the cranial window can be re-sealed using the same technique. This method permits multiple recording sessions for each animal, and can drastically increase the yield of stable recordings, since the implanted cover glass also effectively minimizes brain movement.

For the purpose of measuring the temporal resolution of an all-optical manipulation, measuring spiking delay from the onset of photostimulation and the jitter of the timing of the evoked spiking are essential (Fig. 4b, c). To estimate the spatial resolution of the manipulation, moving the focus of the stimulation laterally (x,y), and in depth (z), should be performed by fixing all features of the photostimulation and moving the objective a fixed amount while recording the effect on spiking. From this, one can measure how far from the targeted neuron the photostimulation could evoke spiking and infer how near another neuron would need to be, to be inadvertently activated by the photostimulation. While this is a useful estimate, in practice "off-target" activation depends on many factors, including the geometry of the area under study and the sparseness of opsin expression, to name a few, meaning the measured lateral and axial resolution is not the same as the effective resolution of the system, which can be estimated using methods in the following sections.
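Following the definitions in Fig. 4c (latency is the time from photostimulation onset to the first spike; jitter is the standard deviation of that latency across repetitions), the two quantities can be computed as in the minimal sketch below. The function name, the input layout, and the 50 ms search window used to decide whether a spike counts as evoked are assumptions of this example.

```python
from statistics import mean, stdev


def latency_and_jitter(spike_times_per_rep, stim_onsets, window_ms=50.0):
    """Return (mean latency, jitter, fraction of repetitions with a spike).

    spike_times_per_rep: one list of spike times (ms) per repetition
    stim_onsets: photostimulation onset time (ms) for each repetition
    window_ms: assumed window after onset in which a spike counts as evoked
    """
    latencies = []
    for spikes, onset in zip(spike_times_per_rep, stim_onsets):
        evoked = [t - onset for t in spikes if 0.0 <= t - onset <= window_ms]
        if evoked:
            latencies.append(min(evoked))  # first spike after stimulation onset
    if len(latencies) < 2:
        raise ValueError("need at least two repetitions with an evoked spike")
    # Jitter is the (sample) standard deviation of the first-spike latency
    return mean(latencies), stdev(latencies), len(latencies) / len(stim_onsets)


# Three hypothetical repetitions with onsets at 100, 200, and 300 ms
spikes = [[105.0, 130.0], [208.0], [311.0, 400.0]]
print(latency_and_jitter(spikes, [100.0, 200.0, 300.0]))
```

Reporting the fraction of repetitions that evoked a spike alongside latency and jitter helps distinguish a reliably driven neuron from one that only occasionally fires within the window.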

Network Responses to Photostimulation It is highly desirable to demonstrate that the scale of the manipulation employed in an experiment combining all-optical manipulation with behavior is matched to that of the neural representation being probed. For example, if targeting individual neurons tuned to a particular orientation in visual cortex, and these neurons are interdigitated with neurons tuned to other orientations in a "salt and pepper" organization, it is sensible to demonstrate single-neuron resolution of photostimulation. A technique to estimate the effective scale of a manipulation is to target individual neurons composing some representation and observe via 2P calcium imaging the effect of stimulation on the non-targeted neurons in the field of view (FOV). For example, if an experiment involves simultaneously targeting 30 neurons that share the same tuning, it is useful to target each of these cells individually and measure the effective network response to each target in addition to targeting the full pattern. In this way, one can screen for undesirable off-target effects or reject neurons that do not respond on their own. Both of these factors could be harder to observe when targeting the full pattern, because a great degree of network modulation is expected as the number of targeted neurons increases, and targeted neurons that are modulated by the full pattern may not actually be responding to photostimulation directly, but just modulated along with the rest of the network.

To screen for off-target effects, or inadvertent photostimulation of neighboring neurons, it is useful to look at whether excitatory responses to untargeted neurons cluster around photostimulation sites (Fig. 5). A common way of representing the spatial extent of the manipulation is to bin the average response of all neurons in the imaging plane by their centroid distance to the photostimulation targets (Fig. 5a). Then one can plot the distribution of responses across a range of radial eccentricities for neurons that do and do not express the opsin (Fig. 5b, c). To visualize this distribution, it is useful to compute a spatial heatmap of the average response to photostimulation across neurons (Fig. 5b). To perform this analysis, for each single-cell stimulation target, compute the average response in a brief time window (e.g., 100 ms) following photostimulation for all the neurons in the field of view expressing the opsin. Using the spatial footprint of each cell body (outline), create an image where each cell body spatial footprint is labeled (colored) with the average response to a photostimulation target, and repeat the procedure to produce images for all the photostimulation targets. Finally, shift the x–y position of the images to align the centroids of the photostimulation targets and average across the images to produce a spatial heatmap of the response to photostimulation (excluding space between neurons from the average). From this visualization, one can determine whether there is a spatial bias of excitatory or inhibitory responses in the area surrounding the targeted neurons. While an idealized version of the result would be that excitatory responses disappear within one neuron's radius away from the photostimulation targets, in practice this is unrealistic. For example, there may be a large degree of local excitatory connectivity which falls off with eccentricity, making a gradually decaying response profile fully expected. 
How then can you disambiguate synaptically (network) driven responses in neighboring neurons from "off-target" photostimulation (Fig. 5c–e)? If opsin expression is sparser than expression of the functional indicator, this means a subset of the neurons in a FOV will be opsin-negative, providing an opportunity to compare their responses to single-cell photostimulation with cells that are opsin-positive (Fig. 5c). If the non-targeted opsin-positive neurons exceed the response of the opsin-negative neurons, this is evidence that the non-targeted opsin-positive neurons are directly activated by the photostimulation laser (Hypothesis 1: "Off-target stimulation," Fig. 5d). If the non-targeted opsin-positive neurons do not respond in excess of the opsin-negative neurons, this is compelling evidence that network modulation can be explained by synaptic effects alone (Hypothesis 2: "Network effects," Fig. 5e, and the experimental outcome in Fig. 5c). Confidence in this interpretation may depend on similar populations of cells being labeled with the opsin in a random and unbiased way or using a preparation which is engineered to be sparsely labeled. However, even in densely labeled samples this approach can be beneficial for identifying non-responsive targeted neurons to exclude from multi-neuron patterns and measuring the spatial scale of the influence of individual neurons in a larger, multiple-neuron manipulation.

Fig. 5 Assessing specificity of photostimulation-evoked activity. (a) Example FOV centered on the mitral cell layer of a Tbet-Cre mouse. One mitral cell is targeted for 2P photostimulation (small white circle, 15 μm diameter). Red labeling corresponds to FLEX-ChrimsonR-tdTomato expression limited to a subset of mitral cells, and green labeling corresponds to pan-neuronal GCaMP6s expression. Normalized fluorescence (ΔF/F) is averaged for every neuron in the 100 ms period following photostimulation and averaged across neurons occupying the same radial distance from the target (example bin size = 50 μm). (b) A spatial heatmap of average response vs. ROI position centered on 49 mitral cell targets (n = 2 Tbet-Cre mice, 138 total neurons, 3631 photostimulations, 3 pulses per photostimulation: 10 ms on – 10 ms off, at 30 mW, or 0.19 mW/μm²). Only cells labeled with ChrimsonR-tdTomato were included. The "targeted" bin is outlined by a black circle. (c) The average radial decay of responses across ChrimsonR-tdTomato positive neurons (Opsin+, red) and ChrimsonR-tdTomato negative neurons (Opsin−, green). Cell responses radially binned and averaged for each targeted neuron and bin means were averaged across targets (mean ± s.e.m., 49 targeted neurons, n = 2 Tbet-Cre mice). Asterisks indicate a significant difference between the average binned response of ChrimsonR+ and ChrimsonR− neurons (\*p < 0.05, two sample t-test, Holm-Bonferroni corrected for multiple comparisons). (d) A cartoon of the hypothesis that photostimulation activates neighboring Opsin+ neurons due to off-target photostimulation: Opsin+ responses will exceed the Opsin− responses, reflecting inadvertent stimulation of nearby Opsin+ neurons. (e) A cartoon of the hypothesis that single-cell photostimulation leads to responses of nearby neurons due to network effects: Opsin+ responses do not exceed the responses of Opsin− neurons. (Adapted from Ref. [25])
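The radial comparison of opsin-positive and opsin-negative neurons can be sketched as below for a single photostimulation target. The data layout (a list of dictionaries with hypothetical keys `id`, `xy`, `response`, and `opsin`) is an assumption of this example; the 50 μm bin width follows the example bin size given for Fig. 5, and the responses are assumed to be mean ΔF/F values already computed in the window following photostimulation.

```python
import math
from collections import defaultdict


def radial_profile(cells, target_id, bin_um=50.0):
    """Mean response per (opsin status, distance bin) around one target.

    cells: list of dicts with keys 'id', 'xy' (centroid in um),
           'response' (mean dF/F after photostimulation), 'opsin' (bool)
    Returns {(opsin, bin_index): mean response}; the target itself is excluded.
    """
    target_xy = next(c["xy"] for c in cells if c["id"] == target_id)
    sums = defaultdict(lambda: [0.0, 0])  # key -> [summed response, cell count]
    for cell in cells:
        if cell["id"] == target_id:
            continue
        distance = math.dist(cell["xy"], target_xy)
        key = (cell["opsin"], int(distance // bin_um))
        sums[key][0] += cell["response"]
        sums[key][1] += 1
    return {key: total / n for key, (total, n) in sums.items()}


# Hypothetical field of view with one target (id 0) and five neighbors
fov = [
    {"id": 0, "xy": (0.0, 0.0), "response": 0.30, "opsin": True},
    {"id": 1, "xy": (10.0, 0.0), "response": 0.04, "opsin": True},
    {"id": 2, "xy": (30.0, 0.0), "response": 0.02, "opsin": True},
    {"id": 3, "xy": (20.0, 0.0), "response": 0.03, "opsin": False},
    {"id": 4, "xy": (60.0, 0.0), "response": 0.01, "opsin": True},
    {"id": 5, "xy": (70.0, 0.0), "response": 0.01, "opsin": False},
]
profile = radial_profile(fov, target_id=0)
print(profile[(True, 0)] - profile[(False, 0)])
```

Averaging these per-target profiles across all targets, and testing whether the opsin-positive bins exceed the opsin-negative bins, corresponds to the comparison between the two hypotheses outlined above; the statistical test itself is omitted from this sketch.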

Omit-One-Target When targeting many neurons simultaneously, the probability of inadvertently activating untargeted neurons is significantly increased. While somatically targeted opsins and axially confined stimulation techniques like temporal focusing help to decrease the incidence of inadvertent activation, it is beneficial to directly test targeting specificity in this more complex regime. A useful approach is to perform an "omit-one-target" experiment and analysis (Fig. 6). After defining a set of neurons to target simultaneously, one can generate holograms systematically omitting each spot from the full pattern (Fig. 6a). These holograms can be used for photostimulation, with multiple repetitions randomly interleaved into blocks of trials. Then it is possible to compare the response in each neuron when it was targeted vs. when it was omitted (Fig. 6b). While ideally the response of every non-targeted neuron would fall to zero, this is unrealistic, especially if targeting groups of neurons that may compose an interconnected circuit. However, if target selection was biased toward highly photostimulation responsive, or sensitive neurons, a general reduction in response magnitude for the omitted neurons can serve as strong evidence for cellular specificity. Demonstrating that neurons within this population can be individually controlled in the presence of simultaneous photostimulation of a large number of targets validates that manipulations of neuron number, or specific neuron identity, can be properly interpreted from behavioral experiments.
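The bookkeeping behind an omit-one-target experiment can be sketched in a few lines. The snippet below is only an illustration under assumptions: target coordinates are hypothetical, the function names are invented for this example, and hologram computation itself (which depends on the SLM software in use) is not shown.

```python
import random


def omit_one_target_patterns(targets):
    """Return the full pattern plus one pattern per omitted target."""
    full = list(targets)
    patterns = [{"omitted": None, "spots": full}]
    for i, target in enumerate(full):
        # Same pattern with spot i removed; one hologram would be
        # computed for each of these spot lists.
        patterns.append({"omitted": target, "spots": full[:i] + full[i + 1:]})
    return patterns


def interleaved_order(n_patterns, n_repeats, seed=0):
    """Shuffle pattern indices within each block so repetitions are interleaved."""
    rng = random.Random(seed)
    order = []
    for _ in range(n_repeats):
        block = list(range(n_patterns))
        rng.shuffle(block)
        order.extend(block)
    return order


# Hypothetical (x, y) target positions in micrometers.
targets = [(10.0, 12.5), (40.2, 8.0), (25.1, 30.3)]
patterns = omit_one_target_patterns(targets)
order = interleaved_order(len(patterns), n_repeats=30)
```

Each entry of `order` indexes into `patterns`, so the response of a given neuron can later be split into trials where it was targeted and trials where it was the omitted spot.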

Fig. 6 Characterizing response in omitted photostimulation targets. (a) Schematic of an omit-one-target experiment. Top. Pattern of spots targeting many neurons simultaneously. Bottom. Omit-one-target condition in which one of the spots is removed from the stimulation pattern. All spots are individually dropped, and the holograms are randomly presented to the mouse along with the "all targeted" hologram. (b) Average response to 10 ms photostimulation. The average response per cell when it was targeted vs. when it was omitted with all other cells targeted (\*\*p < 0.001, two-sample t-test, targeted: 0.08 ± 0.006, vs. omitted: 0.02 ± 0.007, mean ΔF/F ± s.e.m., 28 targeted cells (19 responsive), n = 2 WT mice, 30 photostimulations per datapoint, 10 ms duration, 20 mW/patch, 0.125 mW/μm<sup>2</sup>). (Adapted from Ref. [25])

3.2 Registration Between Photostimulation and Imaging, and In Situ Evaluation of Targeting

In addition to measuring the response characteristics of neurons under study in an all-optical behavioral experiment, it is also essential to characterize the physical position and scale of the stimulus being applied. While extensive characterizations of the point spread function (PSF) of the stimulus should be performed prior to engaging in in vivo experiments, there are several procedures that can be performed on a routine basis during behavioral experiments to significantly increase confidence in the experimental outcomes. These involve registering the photostimulation and imaging arms of the microscope into alignment and methods for online motion correction to correct for slow drift in the positions of targeted neurons. Additionally, we highlight a potentially transformative approach, holographic optogenetic confocally unraveled sculpting microscopy (HOCUS), for in situ evaluation of stimulus shape and position deep in the tissue of awake behaving animals [31].

Calibration It is critical to start by ensuring that the optics of the 2P photostimulation and imaging arms of the microscope have been appropriately selected and aligned to permit imaging and photostimulation of the same FOV. Still, small drifts in the alignment can occur over time. This might be due to temperature fluctuations, mechanical stress, vibrations, changes in laser output angle, among many other factors. Luckily, small translations, rotations, and shearing of the photostimulation FOV relative to the imaging FOV can be compensated for with a simple calibration procedure. A calibration pattern can be burned onto a fluorescent plate by the photostimulation system. Then the plate can be imaged, and the calibration pattern can be used to register the two arms of the microscope by computing the disparity between the desired pattern and observed pattern.

Using a predefined calibration pattern made of several small (~1 μm) targets, deliver 1–2 ms of photostimulation while increasing the average power at the sample until small burn marks appear on the fluorescent plate. The goal is to burn all of the spots in the calibration pattern evenly at a small size. Then, capture an image of the burned pattern and compute the inverse rotation matrix and offset from the desired pattern. This rotation and offset can be applied when generating holograms to exactly compensate for the effects of small drifts in alignment between the photostimulation and imaging arms.
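As a rough sketch of the registration step, the mapping from desired to observed spot positions can be fitted by least squares and then inverted when generating targets. The example assumes a general 2D affine model (rotation, shear, and offset), which is slightly broader than the rotation-plus-offset described above; the function names and the synthetic drift values are hypothetical.

```python
import numpy as np


def fit_affine(desired_xy, observed_xy):
    """Least-squares fit of observed = A @ desired + b (A is 2x2, b has length 2)."""
    desired = np.asarray(desired_xy, dtype=float)
    observed = np.asarray(observed_xy, dtype=float)
    # Homogeneous coordinates: [x, y, 1] @ P = [x_obs, y_obs], with P of shape 3x2.
    X = np.hstack([desired, np.ones((len(desired), 1))])
    P, *_ = np.linalg.lstsq(X, observed, rcond=None)
    return P[:2].T, P[2]


def precompensate(targets_xy, A, b):
    """Positions to command so that the spots land on the intended targets."""
    targets = np.asarray(targets_xy, dtype=float)
    # Solve A @ c + b = target for each commanded position c.
    return np.linalg.solve(A, (targets - b).T).T


# Synthetic example: a small rotation, a slight shear, and an offset (all made up).
desired = np.array([[0, 0], [100, 0], [0, 100], [100, 100], [50, 50]], dtype=float)
theta = np.deg2rad(1.5)
A_true = np.array([[np.cos(theta), -np.sin(theta) + 0.01],
                   [np.sin(theta), np.cos(theta)]])
b_true = np.array([2.0, -1.0])
observed = desired @ A_true.T + b_true  # burn-mark centroids, in the imaging frame

A, b = fit_affine(desired, observed)
commanded = precompensate(desired, A, b)
```

At least three non-collinear calibration spots are needed for the fit to be well determined; using more spots averages out errors in locating the burn marks.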

Online Motion Correction For behavioral experiments that require targeting the same neurons for photostimulation across days, it is useful to employ an online motion correction method (see Note 2). In the experiments described by Gill, Lerman et al., 2020, the FOV was first aligned to a reference image manually, then the position was fine-tuned automatically using a custom-designed closed-loop algorithm, implemented as a module within ScanImage software [25, 55] (Vidrio Technologies). This algorithm attempted to minimize the difference between the reference image and the FOV by iteratively moving the microscope stage (Sutter 285) to reduce the residual displacement computed using a rigid motion correction package (NoRMCorre, Flatiron Institute [56]). The optimization typically converged within 10–15 s, once the residual displacement vector was reduced to <0.5 μm in magnitude. In addition to aligning the FOV across days, we performed this routine between consecutive blocks (60 trials, 6–9 min) during each behavioral session to minimize the effect of slow x–y drift due to brain and microscope motion, thereby ensuring that the photostimulation targets remained consistent throughout each session. We also monitored for drift in the z dimension, which was corrected manually between blocks using the reference image if necessary, though displacement was typically small (~3 μm in a 1.5 h session).
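The closed-loop idea can be illustrated with a toy simulation. This sketch does not reproduce the ScanImage module or NoRMCorre: it swaps in a plain FFT phase correlation for the rigid displacement estimate, works in whole pixels rather than micrometers, and uses hypothetical `acquire` and `move_stage` callbacks in place of real hardware.

```python
import numpy as np


def rigid_shift(reference, image):
    """Integer (dy, dx) displacement of `image` relative to `reference` (phase correlation)."""
    cross = np.fft.fft2(image) * np.conj(np.fft.fft2(reference))
    cross /= np.abs(cross) + 1e-12  # keep phase only
    corr = np.fft.ifft2(cross).real
    peak = np.unravel_index(np.argmax(corr), corr.shape)
    # Peaks past the midpoint correspond to negative shifts (FFT wrap-around).
    return tuple(int(p - n) if p > n // 2 else int(p) for p, n in zip(peak, corr.shape))


def realign(reference, acquire, move_stage, tol_px=0.5, max_iter=20):
    """Move the stage until the residual displacement is below `tol_px`."""
    for n_moves in range(max_iter):
        dy, dx = rigid_shift(reference, acquire())
        if np.hypot(dy, dx) < tol_px:
            return n_moves
        move_stage(-dy, -dx)  # step against the measured displacement
    raise RuntimeError("alignment did not converge")


# Toy "microscope": the current FOV is the reference rolled by a drift offset.
rng = np.random.default_rng(0)
reference = rng.random((64, 64))
offset = [5, -7]


def acquire():
    return np.roll(reference, tuple(offset), axis=(0, 1))


def move_stage(dy, dx):
    offset[0] += dy
    offset[1] += dx


n_moves = realign(reference, acquire, move_stage)
```

In a real system the stage response is imperfect and the images are noisy, which is why an iterative loop with a tolerance is used instead of a single corrective move.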

In Situ Evaluation of Holographic Patterns 2-photon precision optogenetics experiments often rely on holographic wavefront shaping techniques to generate light patterns targeted to individual neurons deep in living tissue. Despite careful alignment between the stimulation and imaging fields of view, light propagating through the brain can undergo significant tissue-induced distortions, mainly through scattering, that lead to a discrepancy between the desired and actual light patterns reaching the neurons. These effects can be unpredictable, as distortions will vary as a function of specific tissue geometry and composition. Distortions are especially detrimental to experiments combining cellular photostimulation with behavior, as the shape and position of holographic patches may be far from desired values, limiting the interpretability of perceptual judgements.

In order to directly measure the effects of tissue-induced distortions on projected light patterns, Lerman et al. 2019 described a new method, holographic optogenetics confocally unraveled sculpting (HOCUS), for real-time, in situ evaluation of holographic light patterns (Fig. 7) [31]. This technique involves confocally descanning reflections from light patterns focused into the brain. Photons emitted from the sample, due to ballistic reflections of light from the photostimulation laser source, are imaged with a confocal detection system. This system can be added to most microscopes combining 2P imaging and photostimulation, making use of scanning optics typically used for 2P imaging in a reverse direction to descan reflections from a static light pattern projected into the tissue (Fig. 7a). Descanned light is focused by an electrically tunable lens to a pinhole to confocally reject out-of-focus light, permitting an evaluation of reflected light patterns at the focal plane (Fig. 7a, b). By combining images collected by 2P imaging and HOCUS (typically averaging frames collected over a second), targeted neurons and holographic patches as they appear in the brain can be viewed simultaneously for evaluation. This technique is capable of measuring tissue-induced distortions unique to a particular field of view and holographic pattern, and enables real-time correction of holographic spot position by adjusting the hologram to compensate for any difference from the desired location. This technique could also be used, in principle, to optimize the generated holograms to compensate for tissue-induced distortions in the shape of the light patches. Ultimately, directly viewing and correcting deviations between the desired light pattern and actual light pattern being used to photostimulate neurons could significantly improve the reproducibility of experiments combining 2P holographic stimulation with behavior.

3.3 Assessing the Reliability of Behavioral Readout

While initial measurements of the responses evoked by photostimulation as well as calibrations to ensure proper targeting are essential for experiments combining all-optical manipulations with behavior, they are not the only controls necessary to be confident in a behavioral result. As an additional precaution, it is extremely useful to ensure that the observed behavioral effects depend only on manipulation of neural activity and not other features of the stimulus. We provide two methods for handling this requirement in the 1P pattern stimulation and 2P photostimulation regimes.

Fig. 7 Visualizing light patterns in vivo with HOCUS. (a) The 2P imaging path is combined with the holographic 2P photostimulation path for in vivo experiments in head-fixed, behaving mice. The reflected stimulation light passes through the PBS, is descanned by the mirrors, and is then reflected by the dichroic mirror through the pinhole onto the detector. PBS polarizing beam splitter, PMT1, PMT2 photomultiplier tubes, SLM spatial light modulator, DET detector. (b) A merged image of GCaMP6s expression (green) and the HOCUS-imaged reflected stimulation light (red) showing a pattern of four light spots projected onto the brain, 120 μm deep in the olfactory bulb, and positioned on four respective neurons. Scale bar: 25 μm. (Adapted from Ref. [31])

Stimulation Masking When designing a stimulus dependency control in the 1P regime, a primary factor to consider is that animals may be able to see the light being used for stimulation. Even if light is delivered through an insulated optic fiber, preventing the animal from detecting the light externally, it is possible that light delivered within the brain can still stimulate the retina. Humans and mice are capable of visual detection at or near the level of single photons, so even moderate light powers in highly scattering media run the risk of inadvertent visual detection through light traveling within the brain and ultimately to the retina [38, 57]. This can seriously affect the outcome of an experiment in which animals are asked to detect photostimulation of groups of neurons, or discriminate between patterns of activation, as an animal could potentially learn to solve the task partially, or entirely, using its vision. A common control for this is to perform the same behavioral experiments on animals that lack a functional opsin and demonstrate that they do not perform the task above chance level. This is a useful control for short-term experiments; however, it does not rule out the possibility of a shifting strategy in opsin-positive animals over a longer time course. All-optical experiments exploring the effects of a large photostimulation parameter space on detection or discrimination often take weeks or months to complete. While animals may initially base their choices on direct neural activation, if visual cues exist, it is possible for them to later switch strategies to rely, to some degree, on directly seeing the stimulus. A solution for this is to use a set of "blanking LEDs" matched to the wavelength of light used for stimulation. These can be positioned near the eyes and triggered during behavioral trials. If the blanking LEDs are significantly brighter than the stimulation, this removes the possibility of using the stimulation light as a cue, as the animals will see a strong light of the same wavelength regardless of the stimulation condition.

Sham-Photostimulation Experiments involving 2P photostimulation provide much less opportunity for inadvertent visual stimulation, since the wavelengths involved are typically near-infrared and thus invisible to rodents and primates. Still, 2P photostimulation comes with its own caveats, as the light powers used tend to be a great deal higher than those used for 1P photostimulation. This introduces the possibility that animals could sense light-induced heating within the brain through its effects on neural activity, or via tactile stimulation of surrounding tissue (see Note 3 and Ref. [58]). Demonstrating that opsin-negative animals cannot detect or discriminate photostimulation patterns, or that animals fail to do so when opsin-negative neurons are targeted, does not exclude the possibility that opsin-positive animals engaged in extensive behavioral training could learn to use heat as a cue.

An ideal control would be to provide a version of the stimulus that reproduces all the features of the original photostimulation, but does not evoke spiking in the targeted neurons. For this, we can leverage the non-linear nature of 2P excitation. Since 2P excitation relies on the peak power of laser pulses reaching the sample, increasing the duration of the laser pulses while fixing the average power can effectively reduce 2P excitation, and thus activation of the opsin (Fig. 8a). Many lasers used for 2P photostimulation have an internal or external compressor for dispersion compensation, or to maximize peak power of the output pulses. It is possible to change the pulse width at the sample from a typical value of ~200 fs to >15 ps by adjusting the laser's compressor, provided it has the appropriate range, all while keeping the average power, and thus the amount of light delivered to the tissue, constant. Therefore, lengthening the pulses provides the opportunity for a "sham" photostimulation control that can reproduce the heat and possible indirect sensory effects present during a behavioral task, but is capable of eliminating spiking induced by 2P photostimulation by reducing 2P excitation by several orders of magnitude (Fig. 8b, c). By interleaving blocks of trials identical to the typical behavioral experiment, but using the sham photostimulation control, it is possible to confirm to what degree the behavior relies on evoked spiking, and not on other factors (Fig. 8d). This method could even be used to calibrate the appropriate average power delivered during the behavior, as the power could be increased during the sham photostimulation condition until it is detectable, then reduced until it is well below the detectable level for the rest of the experiment.

Fig. 8 Sham-photostimulation control. (a) A schematic demonstrating the effect of tuning the laser pulse duration. Time dependence of laser power for pulse trains with the same pulse frequency (f) and average power, but different pulse durations: short pulse, τ<sub>S</sub> (red), and long pulse, τ<sub>L</sub> (gray). To photostimulate a cell, laser power must exceed a certain threshold, P<sub>th</sub>. (b) Left, a table summarizing the differences in effects evoked by the short and long pulse duration stimuli. Right, a schematic of the behavioral setup for the sham photostimulation control experiment. (c) Representative example raster plots (top) and peristimulus time histograms (PSTHs) (bottom) for short pulse photostimulation (~200 fs, red) and long pulse sham photostimulation (control, 15 ps, gray) (20 trials per condition, 30 mW, 10 ms illumination, n = 1 cell in 1 WT mouse). (d) Detection accuracy as a function of photostimulation condition. During the sham-photostimulation control blocks, detection accuracy dropped to chance level (0.5 ± 0.003, mean ± s.e.m., p = 0.37, one-sample t-test, 0.06–0.125 mW/μm<sup>2</sup>, n = 5 mice, 2 WT (filled circles) and 3 Tbet-cre (empty circles)) and was significantly different from both pre- and post-control measurements (p < 0.001, Fisher's exact test, 0.06–0.125 mW/μm<sup>2</sup>, n = 5 mice, 2 WT (filled circles) and 3 Tbet-cre (empty circles)). (Adapted from Ref. [25])
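The dependence of 2P excitation on pulse width in the sham control can be made concrete with the textbook scaling for a pulsed source, in which the time-averaged two-photon excitation is proportional to P<sub>avg</sub><sup>2</sup>/(f·τ). The sketch below applies only this idealized scaling; it ignores pulse shape, opsin kinetics, and the spiking threshold illustrated in Fig. 8a, and the repetition rate is a placeholder value that is not taken from the chapter.

```python
def relative_two_photon_excitation(avg_power_w, rep_rate_hz, pulse_width_s):
    """Time-averaged 2P excitation in arbitrary units, proportional to P_avg^2 / (f * tau)."""
    return avg_power_w ** 2 / (rep_rate_hz * pulse_width_s)


AVG_POWER_W = 0.030   # 30 mW, as in Fig. 8c
REP_RATE_HZ = 2e6     # hypothetical repetition rate (assumption)

short_pulse = relative_two_photon_excitation(AVG_POWER_W, REP_RATE_HZ, 200e-15)
long_pulse = relative_two_photon_excitation(AVG_POWER_W, REP_RATE_HZ, 15e-12)

# Fold reduction at equal average power; it equals tau_long / tau_short
# and is independent of the repetition rate chosen above.
reduction = short_pulse / long_pulse
```

Because the average power is unchanged, the heat deposited in the tissue is the same in both conditions, while the excitation falls in proportion to the pulse lengthening; pulses longer than 15 ps reduce it further. Whether spiking is actually abolished should be verified empirically, as in Fig. 8c.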

#### 4 Notes


session that the holographic spots no longer align with the locations of the targets since the brain has shifted relative to the FOV of the microscope. This may lead to a corresponding drop in behavioral performance across the session, and significantly impact the interpretation of the results. Therefore, it is best to re-align the microscope position with a reference image every few minutes either manually or using an online motion correction procedure (e.g., the method described in Subheading 3.2).

3. Increasing the light power delivered to targeted neurons does not always lead to an improvement in photostimulation efficacy and may be detrimental at high values. The spiking response of individual neurons will generally increase in rate and consistency across repetitions when the average photostimulation power is increased, until the response saturates at a range of values that is particular to each neuron. The point of saturation has been exceeded when increasing the average power does not activate significantly more opsin proteins, or when the maximum firing rate of the cell has been reached. Further increasing the photostimulation power beyond the minimal required intensity has several disadvantages including (1) increasing the likelihood of stimulating untargeted neurons, especially in the axial dimension, and (2) causing excess tissue heating which may lead to physiological and perceptual artifacts [58]. At the limit, high photostimulation power may lead to cellular ablation, and, in practice, the range of effective photostimulation powers is often near this threshold (though the precise power limit depends on many factors such as depth, vasculature, wavelength, etc.). For these reasons, prior to conducting a behavioral experiment, it is useful to measure the response of individual neurons to a range of light powers in order to estimate the minimal light power necessary for photostimulation.
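One simple way to turn such a power series into a working value is to take the lowest tested power whose mean response reaches a chosen fraction of the plateau. The sketch below assumes that criterion (the 90% fraction is arbitrary) and uses invented response values; it is not a procedure prescribed by this protocol.

```python
import numpy as np


def minimal_effective_power(powers_mw, responses, fraction=0.9):
    """Lowest tested power whose response reaches `fraction` of the maximum response."""
    powers = np.asarray(powers_mw, dtype=float)
    resp = np.asarray(responses, dtype=float)
    order = np.argsort(powers)
    powers, resp = powers[order], resp[order]
    plateau = resp.max()
    reached = np.nonzero(resp >= fraction * plateau)[0]
    return float(powers[reached[0]])


# Hypothetical mean dF/F of one neuron at each tested power.
powers_mw = [5, 10, 20, 30, 40]
mean_dff = [0.01, 0.04, 0.075, 0.08, 0.08]
p_min = minimal_effective_power(powers_mw, mean_dff)
```

With noisy data it may be preferable to fit a saturating function to the responses first and apply the same criterion to the fitted curve.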

#### 5 Outlook

The possibility of linking precise coding features to specific behaviors now permits questions previously limited to theory and speculation to be directly addressed. At the coarsest level, determining the essential building blocks for perception through studies of detection using both real and synthetic stimuli will define the boundaries within which to explore the perceptual quality imparted by specific neurons and patterns of activity. By manipulating sensory circuits at a behaviorally and physiologically relevant spatiotemporal scale, a relationship can be established between the perceptual space and the feature space of neural activity. However, the precision of the inferences that can be drawn about this relationship is jointly determined by the resolution of both the behavioral metrics and neural manipulations employed. Therefore, we can expect advances in both capacities as the field continues to mature.

It is inevitable that the opsins, optics, and technology supporting precision optogenetics will continue to improve, ushering in a host of possibilities for bidirectional modulation of increasingly large and diverse neural representations. Less apparent is how behavioral methodologies will adapt to meet the nuance and complexity permitted by these tools. As the dimensionality of the neural feature space explorable by optical manipulations increases, binary behavioral readouts (lick vs. no lick), and simple stimulus-reward associations (stim 1 = rewarded, stim 2 = unrewarded) may be insufficient, or, at best, inefficient for mapping activity features to perception.

To meet these changing demands, our research groups, among others, are exploring new paradigms, as well as adapting existing methods traditionally overlooked for use in rodent behavior that increase the information gained about perceptual quality from each judgement made by an animal. Two examples include continuous report, in which an animal can smoothly adjust the synthetic stimulus to match or deviate from an internal template (potentially derived from a real stimulus), or delayed-match-to-sample, in which two stimuli are directly compared within each trial to determine if they are perceptually identical or different. These methods have the advantage of allowing an animal to directly report the meaningful combination of features composing a percept (continuous report), and allowing many synthetic and natural stimuli to be compared within one experimental session (delayed-match-to-sample). The challenge remains to optimize methods for compatibility with head-fixed behavior and to accelerate training by exploring more intuitive readouts of choice.

A related, and equally important problem involves determining which features to manipulate (which neurons, timing, number of spikes, etc.) to maximize information gained from finite length experiments. As the number of neurons and features that can be addressed in a single experiment increases, screening the perceptual impact of all combinations of features will become prohibitively time consuming. Therefore, it is important for experiments to be guided by models implicating which neurons and combinations of features are likely to have the greatest perceptual impact. One recent method involves calculating the "intersection information" of neurons, by using a statistical approach to identify activity features carrying stimulus and choice information during a sensory guided behavior, laying out predictions for how modulation of features carrying intersection information will affect behavior [59]. Another approach is to infer the functional connectivity of a local circuit from the effects of focal stimulation of component neurons [28]. Using the inferred structure, one could predict how the effects of stimulation will propagate through the circuit, and the stimulation could be designed to activate specific modes of activity, testing their effect on perception. Ultimately, the emergence of newly informative behavioral paradigms along with novel conceptual frameworks will likely be some of the most exciting outcomes to come from the application of precision optogenetics and synthetic perception.

#### References


activity by sculpted light. Proc Natl Acad Sci U S A 107(26):11981–11986


Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

## Spectrally Focused Stimulated Raman Scattering (sf-SRS) Microscopy for Label-Free Investigations of Molecular Mechanisms in Living Organisms

## Tamás Váczi, László Himics, Matteo Bruzzone, Miklós Veres, and Marco dal Maschio

#### Abstract

Stimulated Raman Scattering (SRS) microscopy is a light-based non-linear imaging method for visualizing a molecule based on its chemical properties, i.e., the vibrational energy states reflecting the molecule's structure and its environment. This technique, relying on the specificity of the molecule's spectral fingerprint, enables label-free, high-sensitivity, and high-resolution 3D reconstruction of the distribution and the properties of a molecule within a tissue. Despite its tremendous potential, SRS is still used infrequently in the life sciences, where it could be applied over an extremely broad range of investigations, from the study of molecular interactions at the subcellular level to the characterization of tissue alterations in clinical studies. To help fill this gap, after describing the general principles of SRS, we present the materials and methods needed to integrate spectrally focused Stimulated Raman Spectroscopy (sf-SRS) on commercial multiphoton microscopes and highlight the critical aspects to consider.

Key words Label-free imaging, Stimulated Raman Spectroscopy, Spectral focusing

#### 1 Introduction

Optical microscopy is currently a fundamental technique in biomedical research. The combination of high resolution and low invasiveness, along with tissue- and cell-labelling methods, allows the reconstruction of anatomical or functional images with high contrast and temporal resolution compatible with a large range of biological processes. Most frequently, the mechanisms generating the contrast take advantage of the interaction of excitation light with molecules that carry or are conjugated with fluorescent moieties (label-based), resulting in signals with high signal-to-noise ratios [1].

Eirini Papagiakoumou (ed.), All-Optical Methods to Study Neuronal Function, Neuromethods, vol. 191, https://doi.org/10.1007/978-1-0716-2764-8\_13, © The Author(s) 2023

In parallel with label-based optical imaging methods, there are techniques, relying on different kinds of light-matter interactions, that can reveal the distribution and the properties of a molecule without the need for specific labelling (label-free) [2]. Within this last group, Raman spectroscopy [3–5] represents a powerful tool for label-free characterization of a sample. In this technique one can reconstruct the molecule distribution and properties based on the features of its vibrational states as revealed by the inelastic scattering of the excitation light by the sample [6]. In fact, following the perturbation imposed by the interaction of the photon with the molecule, the system reaches a so-called intermediate state with the molecule in a virtual energy state, from which the relaxation involves excitation of characteristic molecular vibrations. As a consequence, the scattered photon emitted during the relaxation will have an energy different from that of the incident one. The energy gap characterizing the relaxation transitions toward the ground state (being equal to the energy of the excited vibrational transition) is characteristic of the chemical bonds of the molecule and of their environment. By accumulating these relaxation events to gather extended statistics, it thus becomes possible to retrieve a Raman spectrum, i.e., a distribution of the transition probability, returned by the signal intensity, as a function of the energy difference associated with the vibrational states, expressed in terms of the wavenumber or the corresponding Raman shift. Since this type of spontaneous inelastic scattering represents a minimal component of the total (elastic and inelastic) scattering events and is statistically infrequent, with typically one Raman photon emitted per 10<sup>7</sup> incident photons, the detection of these events becomes possible only with rather long integration times. As an alternative, the Raman spectrum can be explored adopting a pump-probe approach [7]. In this scheme of the Raman process, called stimulated Raman scattering (SRS) [8, 9], the probability of vibrational transitions toward the ground level is strongly enhanced, by up to a few orders of magnitude, with respect to spontaneous Raman scattering, taking advantage of a non-linear excitation process of stimulated emission.

#### 1.1 Stimulated Raman Spectroscopy (SRS)

Stimulated Raman scattering (SRS) microscopy represents a label-free technique for imaging with high chemical specificity and high acquisition speeds up to video rate [10]. In SRS microscopy, two pulsed laser beams are temporally and spatially overlapped to coherently excite the sample: a pump beam with frequency ω<sub>p</sub> and a Stokes beam with frequency ω<sub>S</sub> (Fig. 1a). When the frequency difference Δω = ω<sub>p</sub> − ω<sub>S</sub> matches a particular Raman-active molecular vibration energy level of the sample, the system enters a resonance regime where SRS signals are generated. These can be detected either as a decrease of the pump beam intensity, called Stimulated Raman loss (SRL), corresponding to the annihilation of a pump photon, or as an increase of the Stokes beam intensity, known as Stimulated Raman gain (SRG), corresponding to the creation of a photon with frequency ω<sub>S</sub> (Fig. 1b). Under favourable conditions, the existence of a resonance reveals the presence of a molecule with a vibrational transition corresponding to the set energy difference Δω; when Δω does not match any Raman-active vibrational transition, no such resonance occurs and no SRS signal is detected. In this way, scanning over the energy interval by changing Δω allows reconstructing the Raman spectrum contained in the excitation volume. Importantly, with respect to spontaneous Raman signals, as SRS is a non-linear absorption process based on the simultaneous interaction of two beams, with frequencies ω<sub>p</sub> and ω<sub>S</sub>, the illumination of the sample reveals the vibrational configuration only of the molecules located in a rather small volume, corresponding to the region with high spatial and temporal density of photons. This property, ensuring a high spatial resolution, renders SRS suitable for integration in scanning microscopes designed for the three-dimensional characterization of the sample properties.

Fig. 1 (a) The SRS Jablonski diagrams showing the energy states and the transitions associated with Spontaneous Raman and Stimulated Raman signals, with the y-axis indicating the energy level expressed in terms of the radiation frequency. (b) Changes of the laser intensities during Stimulated Raman Spectroscopy in the case of Stimulated Raman Gain (SRG) and Stimulated Raman Loss (SRL). (c) Frequency-time properties of ultrashort Stokes and pump pulses with chirping parameter β = 0 and the corresponding spectral resolution below. (d) Dispersion of the spectral components in chirped pulses with β ≠ 0 under the matched chirping condition and the corresponding spectral resolution below. (e) Dispersion of the spectral components in chirped pulses with β > 0 under the unmatched chirping condition and the corresponding spectral resolution below. (f) Dispersion of the spectral components in chirped pulses with β > 0 under the matched chirping condition with different temporal shifts and the corresponding spectral window. The time axis corresponds to the time of arrival of the spectral components at the same point in space
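Since Raman spectra are usually reported in wavenumbers, it helps to convert between the pump and Stokes wavelengths and the probed shift, Δν̃ = 1/λ<sub>p</sub> − 1/λ<sub>S</sub>. The following sketch performs that conversion; the 800 nm and 1040 nm wavelengths are illustrative values chosen for the example, not settings taken from this chapter.

```python
def raman_shift_cm1(pump_nm, stokes_nm):
    """Raman shift (cm^-1) probed by a pump/Stokes pair; 1e7 / nm gives cm^-1."""
    return 1e7 / pump_nm - 1e7 / stokes_nm


def stokes_for_shift_nm(pump_nm, shift_cm1):
    """Stokes wavelength (nm) needed to probe `shift_cm1` with a given pump."""
    return 1e7 / (1e7 / pump_nm - shift_cm1)


# Illustrative wavelengths: this pair lands in the C-H stretching region.
shift = raman_shift_cm1(800.0, 1040.0)        # about 2884.6 cm^-1
stokes = stokes_for_shift_nm(800.0, shift)    # recovers 1040 nm
```

The pump must have the shorter wavelength (higher frequency) of the two for the shift to be positive.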

A molecule is, in general, characterized by a number of vibrational transitions, and identifying the molecule of interest through its characteristic vibrational levels requires scanning the energy range across the whole Raman spectrum with sufficient spectral resolution. In this regard, one has to consider that the limit in spectral resolution is set by the spectral bandwidth of the radiation, δω, and that, for pulsed sources, there is a fundamental relationship between the spectral bandwidth δω and the pulse duration δτ [11]. This relation, known as the time-bandwidth product, dictates that the product δω·δτ cannot fall below a constant whose value depends on the shape of the source power spectrum. Under this constraint, given a certain δω, the shortest pulse duration δτ corresponds to a pulse where all the spectral components within δω have the same phase. Combining beams of such ultrashort pulses (with typical durations of 80–180 fs) in the pump-probe scheme of SRS results in a scenario where the range of accessible transitions is large, as it encompasses all the energy levels falling within the bandwidth of the pulses (Fig. 1c). However, even if the original pulses have the features described above, it is still possible, within the constraint of the time-bandwidth relation, to reduce the effective δω and so increase the spectral resolution of the SRS system. This approach takes advantage of a technique called pulse chirping, where a phase delay is introduced between the different spectral components within δω, resulting in a time-dependent evolution of the frequency, ω(t) = ω<sub>0</sub> – 2βt. Here the chirp parameter β (Fig. 1d) describes the temporal evolution and dispersion of the spectral components within the pulse, and ω<sub>0</sub> is the carrier frequency of the envelope [11].

The resulting temporal dispersion of the spectral components included in the original δω, along with extending the pulse duration into the ps range, has the effect of reducing the effective δω band that is accessible [12]. If pulse chirping is applied to both the pump and the Stokes beams, the spectral resolution of the SRS acquisition is expected to improve to <10 cm<sup>-1</sup>, making it possible to resolve transitions with a narrower bandwidth, as long as the pulses of the two beams overlap temporally and their chirp conditions are properly matched (Fig. 1d).
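As a rough numerical illustration of the time-bandwidth relation discussed above, the following sketch estimates the spectral width of a transform-limited pulse. It assumes a Gaussian pulse shape, for which the product of the FWHM bandwidth (in Hz) and the FWHM duration equals 2 ln 2/π ≈ 0.441; the function name and the choice of durations are illustrative only.

```python
import math

C_CM_PER_S = 2.99792458e10                  # speed of light in vacuum, cm/s
TBP_GAUSSIAN = 2.0 * math.log(2.0) / math.pi  # ~0.441, Gaussian pulse (assumption)

def transform_limited_bandwidth_cm1(tau_fwhm_fs: float) -> float:
    """Spectral FWHM (cm^-1) of a transform-limited Gaussian pulse of given FWHM duration (fs)."""
    tau_s = tau_fwhm_fs * 1e-15
    bandwidth_hz = TBP_GAUSSIAN / tau_s
    return bandwidth_hz / C_CM_PER_S          # convert Hz to wavenumbers

# Durations spanning the 80-180 fs range quoted in the text
for tau_fs in (80.0, 100.0, 180.0):
    print(f"{tau_fs:5.0f} fs -> {transform_limited_bandwidth_cm1(tau_fs):6.1f} cm^-1")
```

Under these assumptions, a 100 fs pulse carries a bandwidth of roughly 150 cm<sup>-1</sup>, which is why unchirped femtosecond pulses cannot by themselves reach a resolution of about 10 cm<sup>-1</sup>.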

1.2 Spectral Resolution and the Role of the Pulse Duration, Chirping and Delay in SRS

1.3 Spectrally Focused SRS: The General Hardware Layout

Matching the chirp condition basically means that the temporal evolution of the pulses of the two beams is characterized by the same chirp parameter β. It is important to keep in mind that, when the condition is matched, the energy difference Δω = ω<sub>p</sub> − ω<sub>S</sub> between the pump and Stokes beams is the same across the whole region of overlap between the pulses. When the chirp condition is not satisfied, the effective sensitivity is not optimized, δω is not at its minimum, and Δω covers a relatively large range (Fig. 1e). As mentioned above, along with the chirp condition, the degree of temporal overlap between the pump and Stokes pulses also plays an important role, not for the spectral sensitivity but for the range of the spectrum one can explore. In fact, introducing a temporal delay of one pulse with respect to the other selects different subregions of the chirped pulses that overlap temporally. This means that it is possible to identify a relation between the relative delay of the pulses and the corresponding spectral region one can examine (Fig. 1f). Modulating the interpulse delay in this way is a fundamental mechanism for scanning over the spectrum without retuning the beam frequency, which typically takes significantly longer and may require a fine re-adjustment of the optical system.
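The relation between chirp matching, interpulse delay and the probed Raman shift can be sketched numerically. The example below assumes that both pulses follow the linear chirp ω(t) = ω<sub>0</sub> – 2βt given earlier and that the Stokes pulse is delayed with respect to the pump; the numerical values of β, the delay and the carrier difference are hypothetical, and the sign of the shift depends on which pulse is delayed.

```python
import numpy as np

def raman_shift_cm1(t_fs, delay_fs, beta_p, beta_s, delta0_cm1):
    """Instantaneous pump-Stokes frequency difference (cm^-1) at time t (fs).

    Each pulse is assumed to follow w(t) = w0 - 2*beta*t, with beta in cm^-1/fs;
    the Stokes pulse is delayed by delay_fs; delta0_cm1 is the carrier difference.
    """
    pump_offset = -2.0 * beta_p * t_fs
    stokes_offset = -2.0 * beta_s * (t_fs - delay_fs)
    return delta0_cm1 + pump_offset - stokes_offset

t = np.linspace(-500.0, 500.0, 101)   # temporal overlap window in fs (hypothetical)

# Matched chirp: the difference is constant in time and shifted by 2*beta*delay
matched = raman_shift_cm1(t, 500.0, 0.03, 0.03, 2900.0)
# Unmatched chirp: the difference drifts across the overlap window
unmatched = raman_shift_cm1(t, 500.0, 0.03, 0.025, 2900.0)

print(f"matched:   mean = {matched.mean():.1f} cm^-1, spread = {np.ptp(matched):.2f} cm^-1")
print(f"unmatched: spread = {np.ptp(unmatched):.2f} cm^-1")
```

With matched chirp the spread vanishes and only the delay sets the probed shift, whereas a mismatch in β makes Δω sweep over a range during the overlap, which is the loss of resolution described for Fig. 1e.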

Implementing a system for SRS with spectral focusing capabilities within a microscope for 3D beam scanning typically requires the integration of a set of add-on systems: (1) the optical pathway for the routing, the temporal control, the high-frequency modulation, and the conditioning of the pump and the Stokes beams; (2) a dedicated signal detection apparatus; (3) a signal processing chain based on a lock-in amplifier to return the SRS signal intensity corresponding to the illuminated spot ("pixel"). While these aspects will be described in detail in the Methods section, here we present the basic principles guiding the sf-SRS implementation (Fig. 2). SRS relies on a pump-probe scheme based on the spatial and temporal superposition of the pulses of two beams characterized by different frequencies, i.e., ω<sub>p</sub> and ω<sub>S</sub>. The spatial overlap requires the use of ultrafast routing mirrors, to ensure the collinear overlap of the two beams at the sample in XY (the directions transverse to the light propagation direction), and the adoption of beam expanders (BE), to match the diameters and the degree of divergence/convergence of the beams so that the focal spots overlap well along Z (the direction of light propagation).

Along with the spatial overlap of the beams, SRS generally requires precise control of the temporal overlap between the pulses of the two beams. It is therefore fundamental that the pump and Stokes pulses are emitted with a fixed phase delay with respect to each other. This relies on the use of two light sources, or a single source with dual output channels, whose pulse emissions are synchronized. Having synchronized pulses does not necessarily imply that the pulses overlap at the sample and, in general, there is no guarantee that the delay between the pulses remains the same when one of the emission frequencies is varied. Taking these aspects into account requires a careful design of the optical paths (Fig. ): first, optimizing the difference in the distance travelled by the two beams in order to minimize the range of possible delays; second, integrating a delay line in one of the beam paths to finely tune the residual delay and bring the pulses into overlap at all the selected frequencies. Along each of the two beam lines, it is practical to have a system for attenuating the laser intensity (I) to keep the power of the two beams under control.

As discussed in the previous section, in the case of sf-SRS the proper tuning of the pulse temporal properties becomes extremely important. In contrast to SRS with ultrashort pulses, tuning the delay line in sf-SRS not only affects the temporal overlap but also selects the spectral window probed by the chirped pulses (Fig. 1f). The temporal constraints of chirped-pulse sf-SRS, and the fact that the original pulses from the two sources might not have exactly the same temporal properties, call for an optical path design that ensures precise control of the chirp parameter on both beams. Assuming that the sources generate ultrashort pulses, stretching the pulse to obtain a sufficient degree of positive chirp is typically achieved by integrating optical elements with dispersive properties (pulse stretching apparatus, PSA; see Materials section), so that the spectral components travel within such materials at different speeds and acquire a frequency-dependent delay. As the properties of the pulse can vary with excitation frequency, it is necessary to insert a component in one of the beams for fine-tuning the chirp parameter β by controlling the group delay dispersion (GDD). This fine-tuning of the pulse dispersion can be achieved with an external device (pre-chirper) or by taking advantage of the built-in GDD-control unit that is available in certain sources.

Fig. 2 The hardware layout for the integration of a sf-SRS in a laser scanning microscope. The main components include: the time-locked pulsed laser sources (pump and Stokes beam), a Group Delay Dispersion (GDD) system, a laser Intensity control (I) unit, a Pulse Stretching Apparatus (PSA), a set of Beam Expanders (BE), an Acousto-Optic Modulator (AOM), a delay line, a Balanced Detection Component (BDC), a Dichroic Mirror (DM), a Photomultiplier Tube (PMT), a Polarization-sensitive Beam Splitter (PBS), a pair of Photodiodes (PD), a differential amplifier and a Lock-in amplifier. On the upper part, the temporal features of the Stokes beam pulse and of the pump beams, presenting an orthogonal polarization and a temporal delay in the pulse replica with respect to the Stokes pulse interacting with pump beam, are presented

Regarding the configuration of the signal detection, one has to consider that, unlike in techniques where excitation and detection channels are spectrally separated, the SRS signal is a subtle variation in the intensity of one of the excitation beams (SRL or SRG). This implies that the detection chain should be able to discriminate the intensity variations associated with the Raman process from the average level and from the signal fluctuations intrinsic to the sources, even when the system noise is substantially larger than the signal. Under these circumstances, the limiting factor in the detection is not the sensitivity of the detector but rather the capability to extract the valid signal from a very large background. This is achieved by combining a photodiode (PD) as a detector with the detection optics and feeding its output through a signal amplification chain using a lock-in amplifier. This type of amplifier implements phase-sensitive detection [13], which strongly amplifies the signal at a specific frequency while suppressing the contribution of the other frequency components, thereby extracting the contribution due to the Raman interaction with the sample. For this purpose, the lock-in detection approach requires the insertion of a high-frequency intensity modulation device (typically an acousto-optic modulator, AOM), capable of modulating the intensity of one of the excitation beams at the chosen driving frequency. The signal acquired by the PD, independently of the configuration adopted, carries the same high-frequency modulation, and this allows the detection of subtle changes in the intensity associated with a Raman process. The output of the lock-in amplifier is ultimately sampled by the acquisition system of the microscope and is synchronized with the scanning system to form an SRS image.
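The principle of phase-sensitive detection described above can be illustrated with a minimal digital lock-in sketch. This is not the processing performed by the hardware lock-in amplifier of the setup; it is a simplified simulation in which the sampling rate, modulation frequency, modulation depth and noise level are all hypothetical values.

```python
import numpy as np

def lock_in_amplitude(signal, t, f_ref):
    """Dual-phase demodulation: multiply by reference quadratures and average."""
    x = 2.0 * np.mean(signal * np.cos(2.0 * np.pi * f_ref * t))
    y = 2.0 * np.mean(signal * np.sin(2.0 * np.pi * f_ref * t))
    return np.hypot(x, y)

rng = np.random.default_rng(0)
fs = 100e6            # sampling rate in Hz (hypothetical)
f_mod = 4e6           # modulation frequency in Hz (hypothetical)
n = 200_000           # 2 ms of data, i.e. an integer number of modulation cycles
t = np.arange(n) / fs

dc_level = 1.0        # large background: the detected beam itself
srs_depth = 1e-4      # tiny modulation transferred by the Raman process
noise = 1e-3 * rng.standard_normal(n)   # broadband noise, 10x the SRS amplitude

detected = dc_level + srs_depth * np.cos(2.0 * np.pi * f_mod * t) + noise
recovered = lock_in_amplitude(detected, t, f_mod)
print(f"true depth = {srs_depth:.2e}, recovered = {recovered:.2e}")
```

Averaging over many modulation cycles is what rejects both the constant background and the noise at other frequencies; shortening the averaging window (the pixel dwell time) degrades the estimate accordingly.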

1.4 Application Perspectives

sf-SRS is a label-free imaging technique that requires dedicated hardware and software to control the acquisition. Moreover, the optimal tuning of the many critical parameters can be very time consuming if the system does not provide some form of automatic regulation and re-calibration of the components. It is therefore not surprising that this technique is typically found in physics or chemistry labs, with applications that, in most cases, deal with chemical or material sciences. Beyond this scenario, the number of questions that such techniques could help answer is considerable, especially in the life sciences. This technique can be applied to a very broad range of investigations, from the mapping of the molecular mechanisms inside a cell to the characterization of tissues in living organisms, along with the alterations associated with different pathological conditions [14]. The use of this technique has been reported for mapping neuronal activity in a label-free manner [15, 16] and in combination with micro-endoscopy approaches [17, 18]. Research on human biopsies or human-derived tissues is becoming more frequent and takes advantage of techniques like SRS for characterizing structural and molecular properties. Moreover, the identification of strategies for obtaining a label-based Raman signal is opening the way to multichannel analysis [19].

#### 2 Materials

Here we report a list of components to assemble a sf-SRS system and integrate it into a commercial scanning microscope. All components not labelled otherwise are from Thorlabs (Newton, New Jersey, USA).

General Optomechanical Assembly Post holders (PH75E/M, UPH30/M); posts (TR50/M, TR20/M); clamping forks (CF038, CF125); kinematic mirror mounts (KM100 or KS1, KM100DL/M); kinematic mounts for rectangular optics (KM100S, KM100SL); kinematic platform mounts (KM100B/M, KM200B/M); kinematic prism mounts (KM100PM/M); clamps (PM4/M); 30 mm cage cube (C4W) with blank cover plate (B1C/M), kinematic rotating platform (B4CRP/M) and dichroic filter holder (FFM1); rotation mount (CRM1); iris diaphragms (SM1D12D, ID8, ID15); cage plates (CP33/M, CP33T/M, CP4S); kinematic, cage-compatible mounts (KC1-T/M); cage rods (ER1, ER1.5); zoom housings (SM1ZM); thread adapters (SM1A6, SM1A6T); rotation mounts (DLM1/M, RSP1D/M), SM2 lens tube coupler (SM2T2); thread adapter (SM1A2); lens tubes (SML05, SML10, SML15); SM1 lens tube couplers (SM1T1); SM1 flexure sleeve lens tube couplers (SM1CPL10); Ø1.5" mounting post (P150/M); table clamp (PF175B); Ø1.5" post mounting clamp (C1511/M); compact 5-axis stage (PY005).

Mirrors Low-GDD ultrafast mirrors (UM10-45B in the Stokes arm, UM10-AG for the pump arm and the common beam path); ultrafast retroreflecting (hollow roof) mirrors (HRS1015-AG); knife-edge right-angle prism silver mirrors (MRAK25-P01); D-shaped silver mirror (PFD10-03-P01); for combining the pump and the Stokes beams (DM), a 25.2 × 35.6 mm dichroic long-pass filter, 1000 nm cut-on (#87-046, Edmund Optics, York, UK).

Divergence Correction/Focusing Elements Achromatic NIR lenses (ACN127-020-B, AC254-040-B); variable beam expander (BE052-B).

Balanced Detection Reference Arm Mounted achromatic halfwave plates (AHWP05M-980 or AHWP10M-980); mounted polarization-sensitive beamsplitter cubes (CCM1-PBS255/M).

Balanced Detection Substage Assembly Olympus 1.4 NA oil condenser (U-AAC); premium long-pass filters (FELH1000); mounted polarization-sensitive beamsplitter cube (CCM1-PBS255/M); achromatic NIR lenses (AC254-035-B); mounted photodiode detector (SM05PD1A) with bias module (PBM42).

Delay Stage Direct-drive linear translation stage (DDSM100/M) with programmable controller (KBD101).

Dispersive Glass High-dispersion glass blocks (S-TIH53, OHARA, Hofheim, Germany), polished on both ends, 1 pc 36 × 36 × 200 mm, 1 pc 15 × 25 × 75 mm.

Acousto-Optic Modulator MT110-B50A1,5-IR-Hk AOM cell + MPDS1C RF driver (AA OPTO-ELECTRONIC, Orsay, France).

Sources Chameleon Discovery source with dual output (one fixed at nominal 1040 nm, one tunable between 680 and 1300 nm), 80 MHz repetition rate, 100–140 fs pulse length, ca. 4 W (Stokes/probe) and 1.6–1.8 W (pump) power, built-in GDD precompensation (Coherent, Santa Clara, CA).

Microscope FEMTOSmart research-grade two-photon (2P) galvanometer scanning microscope with one- or two-channel 2P-epi detection (PMT) (Femtonics, Budapest, Hungary) and custom-made transmission SRS detection; Nikon NIR APO 40×/0.80 W objective.

Auxiliary Equipment CARPE autocorrelator with external detector (APE, Berlin, Germany); SPD2122X arbitrary waveform generator (Siglent, Solon, OH); SR865A lock-in amplifier (Stanford Research Systems, Sunnyvale, CA); laboratory power supply, 24 V (AOM driver) and 32 V (photodiode bias) outputs. MES control software (Femtonics, Budapest, Hungary) based on MATLAB (MathWorks, Natick, MA).

### 3 Methods

3.1 Main Components for Integrating sf-SRS in a Multiphoton Microscope

Figure 3 shows the schematic of an SRS microscope with single-channel lock-in detection. It consists of the tunable-wavelength pump (TUN) and fixed-wavelength Stokes (FIX) laser beams that, in the case of femtosecond laser sources, are guided along their light paths with ultrafast-enhanced mirrors. These optical elements are designed to introduce minimal distortions to the wavefronts of the ultrashort laser pulses upon reflection. Mirrors are used to fulfil the criteria of SRS excitation described above, namely to create the extra light path for one of the beams for the coarse compensation of the relative delay of the pulses, to couple the beam(s) in and out of the delay line, the acousto-optical modulator and other components, as well as to adjust their position and angle in order to obtain collinearly and concentrically aligned combined beams delivered to the microscope. Before entering the microscope, the pump and Stokes beams are combined with a dichroic mirror (an optical element that is transparent or reflective in certain wavelength regions) reflecting one beam (the pump in Fig. 3) and transmitting the other (the Stokes in Fig. 3). For the precise alignment of the two beams, two steering mirrors are inserted into the light path of the reflected beam before the dichroic mirror (not shown here). In general, including a pair of steering mirrors before the main components of the SRS and any other optical system (delay line, AOM, dichroic mirror, microscope) simplifies the alignment and fine tuning of the beams.

Fig. 3 Schematic layout of an SRS microscope system. FIX – fixed wavelength Stokes light source, TUN – tuneable wavelength pump light source, M – mirrors, PD – photodiode, AUX in – signal input of the microscope for imaging, AOM – acousto-optical modulator

The temporal synchronization of the pulses, and the compensation of its variation with the emission frequency and GDD setting of the light source, are performed by the delay line. Since the delay between the pump and Stokes pulses depends on the wavelength, the delay line is usually built on a motorized linear stage with micrometer resolution. The optical components of this unit can be mirrors or the combination of a knife-edge and a hollow roof mirror. To avoid lateral drift of the beam, the optical paths to and from the moving mirrors are aligned parallel to the direction of travel of the delay stage.

In the microscope, the collinear, divergence-corrected and temporally synchronized pump and Stokes beams are delivered and scanned laterally across the sample with the same optical components (galvo scanners, scan lens, tube lens, and microscope objective) used for multiphoton imaging. The detection of the SRS signal is more conveniently performed in the forward direction (transmission geometry). The light from the sample is collected with a condenser or an objective lens of high numerical aperture (NA) and directed to the detector; as a rule of thumb, the NA of the light collection optics has to be equal to or larger than that of the excitation side. Since a high laser intensity is detected during an SRS measurement, in most cases a biased PD is used to record the signal. An optical filter is placed in front of the PD, allowing only one of the beams to reach the detector surface. Depending on the configuration, the intensity change in the pump (SRL detection) or Stokes (SRG detection) beam is recorded.

As mentioned in the previous section, in most cases SRS systems use lock-in detection. Therefore, the intensity of the pump (for SRG detection, when the Stokes beam is detected) or the Stokes (for SRL detection, when the pump beam is detected) beam must be modulated with a driving signal of a given frequency. This can be realized with a chopper, an acousto-optic modulator (AOM), or an electro-optic modulator (EOM). For high-speed imaging, the modulation frequency is typically set to a few MHz, in order to operate far from the 1/f noise region of the source and to evaluate the SRS signal associated with each pixel over a sufficient number of modulation cycles.

An SRS measurement starts with the adjustment of the wavelength of the tunable light source, followed by setting the corresponding delay line position and turning on the intensity modulation. Then the intensities of the pump and Stokes beams have to be adjusted. As a rule of thumb, the intensity of the modulated beam has to be twice the other, and, in general, the higher the laser intensity, the stronger the SRS effect. However, in practice the laser-induced damage of the sample and the appearance of different artefacts (Kerr effect, cross-phase modulation, etc.) limit the useful laser intensity levels.

The net GDD (chirp) of the pulses comes from three main sources: the chirp introduced within the laser, the optical setup including the SRS system and the microscope, and the additional element(s) used to achieve pulse stretching. The contribution of these is difficult to calculate or estimate precisely; therefore, the spectral and temporal widths of the laser beams after passing through the SRS system and the microscope have to be measured experimentally, with a spectrometer and an autocorrelator, respectively. The results can be used to determine the amount of chirp to be introduced into the pump and Stokes beams to achieve the chirp matching condition.

While the chirp of the emitted pulses cannot be adjusted directly for some femtosecond lasers, other systems include the option of compensating the group delay dispersion introduced by the optical elements. This is a convenient tool to fine-tune the net GDD and to establish precise chirp matching in spectrally focused SRS measurements under a variety of laser wavelengths, i.e., Raman sampling ranges.

The chirp related to the optical components of the SRS microscope is an intrinsic property of the system that has to be taken into account when adjusting the chirp matching conditions. The main source is represented by transmissive optical elements—divergence correctors, filters, beam splitters, objective lens, etc.

The third, and most substantial, component is the pulse stretching device. In practice, the temporal redistribution of the spectral components within the pulse can be achieved by using combinations of gratings, prisms, or multilayered mirrors. Another approach is to use a dispersive medium, in which the components of the laser pulse with different wavelengths travel with slightly different velocities. The velocity difference per unit length of the dispersive medium is given by the Group Velocity Dispersion (GVD), and GDD = GVD · length of the medium. As a consequence, a chirp is introduced into the pulse, the extent of which can be adjusted through the length of the dispersive medium. With a suitable medium of high GVD (for example flint glass), this solution can be very effective and cost-efficient.

3.2 Methods for Spectral Tuning of the System and Its Optimization: Pulse Chirp and Delay Control

Table 1 Group velocity dispersion of the Ohara S-TIH53 high density flint glass at different wavelengths [20]

In general, the GVD of dispersive media decreases rapidly with increasing wavelength and can be very low for the near-infrared region characteristic for many femtosecond lasers. Therefore, a longer medium has to be inserted into the beam of higher wavelength to achieve the same amount of chirp. High-density flint glass could be a material of choice for pulse chirping providing acceptable GVD values in a broad spectral range (see Table 1). After establishing the amounts of the chirp to be introduced into the pump and Stokes beams, the glass can be cut to the required length and the cut faces can be polished flat and coated with an antireflective coating in order to minimize the losses and wavefront distortions of the beam.
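The relation GDD = GVD · length can be turned into a quick estimate of how much glass each beam needs. The sketch below is only illustrative: the GVD values are placeholders chosen to show the wavelength trend and are not the S-TIH53 values of Table 1, and the target GDD is hypothetical.

```python
def gdd_fs2(gvd_fs2_per_mm: float, length_mm: float) -> float:
    """Group delay dispersion (fs^2) accumulated in a medium of given length."""
    return gvd_fs2_per_mm * length_mm

def length_for_gdd_mm(target_gdd_fs2: float, gvd_fs2_per_mm: float) -> float:
    """Length of dispersive medium (mm) needed to reach a target GDD."""
    return target_gdd_fs2 / gvd_fs2_per_mm

# Placeholder GVD values (fs^2/mm); replace with the tabulated values for the glass used
GVD_PLACEHOLDER = {"pump, shorter wavelength": 280.0, "Stokes, 1040 nm": 200.0}
TARGET_GDD_FS2 = 1.0e5   # hypothetical target GDD

for beam, gvd in GVD_PLACEHOLDER.items():
    print(f"{beam}: {length_for_gdd_mm(TARGET_GDD_FS2, gvd):.0f} mm of glass")
```

Because the GVD is lower at longer wavelengths, the same target GDD translates into a longer glass path for the longer-wavelength beam, in line with the text above.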

Once the chirp introduced by the laser and the optical setup are known, it is possible to calculate the amount of chirp to be applied to the two beams for the chirp matching, along with the achievable spectral resolution Δω<sub>cc</sub> depending on the material length [21–23], according to the following formulas:

$$
\tau = \tau_0 \sqrt{1 + \left[\frac{4 \ln 2 \cdot \text{GDD}}{\tau_0^2}\right]^2} \tag{1}
$$

$$
\beta = \left[ \left( \frac{\Delta \phi}{\tau} \right)^2 - \left( \frac{2 \ln 2}{\pi c \tau^2} \right)^2 \right]^{1/2} \tag{2}
$$

$$
\Delta\omega_{cc} = \left[ \frac{(2\ln 2)^2 \left(\tau_S^2 + \tau_P^2\right)}{\pi^2 c^2 \tau_P^2 \tau_S^2} + \frac{\tau_P^2 \tau_S^2 (\beta_P - \beta_S)^2}{\left(\tau_S^2 + \tau_P^2\right)} \right]^{1/2} \tag{3}
$$

where τ<sub>0</sub> is the pulse duration in the case of a transform-limited Gaussian pulse, c is the speed of light in vacuum, and τ<sub>P</sub> and τ<sub>S</sub> the relative length of the Pump and Stokes pulses, respectively.
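Equations 1–3 can be evaluated directly; the sketch below is one possible transcription under stated assumptions. Durations are taken in fs, GDD in fs<sup>2</sup>, c in cm/fs, and Δφ in Eq. 2 is interpreted as the spectral FWHM of the pulse in cm<sup>-1</sup> (an assumption based on dimensional consistency, since the symbol is not defined in the text). Function names and example values are illustrative.

```python
import math

C_CM_PER_FS = 2.99792458e-5   # speed of light in vacuum, cm/fs
LN2 = math.log(2.0)

def stretched_duration_fs(tau0_fs, gdd_fs2):
    """Eq. 1: duration of an initially transform-limited Gaussian pulse after a given GDD."""
    return tau0_fs * math.sqrt(1.0 + (4.0 * LN2 * gdd_fs2 / tau0_fs**2) ** 2)

def chirp_parameter(bandwidth_cm1, tau_fs):
    """Eq. 2: chirp parameter beta (cm^-1/fs); bandwidth_cm1 plays the role of delta-phi."""
    limit_term = 2.0 * LN2 / (math.pi * C_CM_PER_FS * tau_fs**2)
    # max() guards against tiny negative values from rounding at the transform limit
    return math.sqrt(max((bandwidth_cm1 / tau_fs) ** 2 - limit_term**2, 0.0))

def cross_correlation_bandwidth_cm1(tau_p_fs, tau_s_fs, beta_p, beta_s):
    """Eq. 3: spectral resolution (cm^-1) for given durations and chirp parameters."""
    sum_sq = tau_s_fs**2 + tau_p_fs**2
    limit = (2.0 * LN2) ** 2 * sum_sq / (
        math.pi**2 * C_CM_PER_FS**2 * tau_p_fs**2 * tau_s_fs**2)
    mismatch = tau_p_fs**2 * tau_s_fs**2 * (beta_p - beta_s) ** 2 / sum_sq
    return math.sqrt(limit + mismatch)

# Illustrative values: 100 fs pulses, both stretched with a hypothetical GDD of 1e5 fs^2
tau = stretched_duration_fs(100.0, 1.0e5)
bandwidth = 2.0 * LN2 / (math.pi * C_CM_PER_FS * 100.0)   # transform-limited width, cm^-1
beta = chirp_parameter(bandwidth, tau)
print(f"tau = {tau:.0f} fs, beta = {beta:.4f} cm^-1/fs")
print(f"matched resolution = {cross_correlation_bandwidth_cm1(tau, tau, beta, beta):.1f} cm^-1")
```

Note that the second term of Eq. 3 vanishes when β<sub>P</sub> = β<sub>S</sub>, so at chirp matching the resolution is set only by the stretched pulse durations.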

An important point here is that, due to its non-unity refractive index, the optical medium inserted into the beams affects the relative delay of the pulses. To compensate for this, the additional optical path has to be calculated by multiplying the length of the medium by its refractive index (the refractive index indicates how much slower light travels in a given medium, and hence how much the pulse is delayed compared with propagation in vacuum).
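The delay bookkeeping for the inserted glass can be sketched as follows. The refractive index used here is a placeholder, not a datasheet value for S-TIH53; in addition, for an estimate of the pulse envelope delay the group index would be the appropriate quantity, so the figures below are only indicative.

```python
C_MM_PER_PS = 0.299792458   # speed of light in vacuum, mm/ps

def optical_path_mm(length_mm: float, refractive_index: float) -> float:
    """Optical path length of the medium: geometric length times refractive index."""
    return length_mm * refractive_index

def extra_delay_ps(length_mm: float, refractive_index: float) -> float:
    """Delay added by the medium relative to the same geometric length in vacuum."""
    return length_mm * (refractive_index - 1.0) / C_MM_PER_PS

N_PLACEHOLDER = 1.8   # hypothetical refractive index of a dense flint glass
for passes in (2, 4):
    length = passes * 200.0   # total glass length for multiple passes of a 200 mm rod
    print(f"{length:.0f} mm of glass: optical path {optical_path_mm(length, N_PLACEHOLDER):.0f} mm, "
          f"extra delay {extra_delay_ps(length, N_PLACEHOLDER):.0f} ps")
```

Delays of this size, in the nanosecond range for several hundred millimetres of glass, are far beyond the travel of a typical fine delay stage, which is why the coarse path difference between the two arms has to be designed in.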

Fig. 4 Chirp matching as a function of glass length (green line; Lp: glass length in pump beam, with additional 300 mm glass in Stokes beam). At the value of β<sub>P</sub>/β<sub>S</sub> = 1, the difference between the instantaneous frequencies of the temporally overlapping pulses is constant. The violet line shows the calculated best spectral resolution without distortion effects

An example calculation of the chirp matching condition for the Coherent Chameleon Discovery laser with Ohara S-TIH53 high density flint glass is shown in Fig. 4. The green line represents the change of the ratio of the pump (β<sub>P</sub>) and Stokes (β<sub>S</sub>) chirp parameters with the length Lp of the glass the beams travel through. The chirp matching condition (when the β<sub>P</sub>/β<sub>S</sub> ratio of the chirp parameters of the pump and Stokes beams is 1) is reached at a glass base length of around 500 mm. One can also calculate the spectral resolution, equivalent to the cross-correlation bandwidth (Δω<sub>cc</sub>), which has its minimum close to the chirp matching condition (purple curve in Fig. 4).

The layout of the optimized spectral focusing unit is shown in Fig. 5. For compactness, one 38 × 38 × 200 mm glass rod is used for both the Stokes and pump beams, polished on both 38 × 38 mm ends. Combinations of knife-edge and hollow roof mirrors are used to direct the two beams into the glass rod, make them pass through it several times, and then return them into the original beam path of the SRS system. Two hollow roof mirrors are used for the Stokes beam (glass length: 800 mm): after the first pass, a horizontally oriented mirror reflects the beam above the height of the first pass. At the end of the second pass, another, vertically oriented hollow roof mirror reflects it at the height of the second pass, after which the first mirror reflects it onto the knife-edge mirror at the height of the first pass. The beam thus passes through the 200 mm long glass four times. The tunable beam passes through the 200 mm glass rod only twice. Here, a shorter rod of 75 mm length is also used to achieve the required 550 mm (2 × 200 mm + 2 × 75 mm) of glass length. The distances between the knife-edge and the hollow roof mirrors are adjusted to approximately maintain the temporal synchronization of the Stokes and pump pulses.

Fig. 5 Schematic layout of the SRS spectral focusing unit

The knife-edge mirrors couple the laser beams into and out of the spectral focusing unit. If needed, e.g., for femtosecond two-photon measurements, these mirrors can be mounted on magnetic, detachable mounts so that the system can be used in both femtosecond and spectral focusing modes.

A spectral focusing SRS system can be used to record vibrational spectra of the samples. As detailed in the Introduction, by tuning the relative delay between the spectrally focused pulses, different vibrational transitions are excited without tuning the wavelength of the laser. By recording a calibration Raman spectrum on a known, suitable sample, the delay line positions can be converted into Raman shift values after appropriate curve fitting. Figure 6 demonstrates this capability and the high spectral resolution of a sf-SRS microscope on succinic acid, a compound with intrinsically narrow Raman lines. The resolution of the system was found to be on par with the calculated resolution of 8 cm<sup>-1</sup> for this system (cf. Fig. 4, purple line minimum at 500–550 mm). Note that a known chirp parameter at the chirp matching condition can also be used to convert the relative delay stage movement into a difference in Raman shift: the dimensions of β [cm<sup>-1</sup>/fs] can easily be converted to [cm<sup>-1</sup>/μm].

Fig. 6 Spontaneous Raman and sf-SRS spectra of succinic acid. The raw SRS spectrum (green) is shown with respect to the delay line position on the upper X-axis and with respect to the corresponding Raman shift. With a linear chirp, the stimulated Raman shift changes linearly with delay stage movement. The spontaneous Raman spectrum (blue) is shown as reference
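The delay-to-Raman-shift calibration described above can be sketched as a linear fit, together with the unit conversion of β. The calibration points below are synthetic, not measured peak positions; the factor of 2 between stage travel and optical path assumes a retroreflecting (double-pass) delay line, and any additional factor arising from the definition of β in ω(t) = ω<sub>0</sub> – 2βt is deliberately left out.

```python
import numpy as np

C_UM_PER_FS = 0.299792458   # speed of light in vacuum, um/fs

def calibrate_delay_axis(stage_positions_mm, known_shifts_cm1):
    """Least-squares line mapping delay stage position (mm) to Raman shift (cm^-1)."""
    slope, intercept = np.polyfit(stage_positions_mm, known_shifts_cm1, 1)
    return slope, intercept

def position_to_shift(position_mm, slope, intercept):
    """Apply the linear calibration to one or more stage positions."""
    return slope * np.asarray(position_mm, dtype=float) + intercept

def beta_per_um(beta_cm1_per_fs, path_um_per_stage_um=2.0):
    """Convert beta from cm^-1/fs to cm^-1 per um of stage travel (double pass assumed)."""
    return beta_cm1_per_fs * path_um_per_stage_um / C_UM_PER_FS

# Synthetic calibration points: stage positions of peaks with known Raman shifts
positions_mm = np.array([10.00, 10.05, 10.10])
shifts_cm1 = np.array([2900.0, 2950.0, 3000.0])
slope, intercept = calibrate_delay_axis(positions_mm, shifts_cm1)

print(f"slope = {slope:.1f} cm^-1/mm")
print(f"shift at 10.02 mm = {float(position_to_shift(10.02, slope, intercept)):.1f} cm^-1")
print(f"beta = 0.03 cm^-1/fs -> {beta_per_um(0.03):.3f} cm^-1/um")
```

A straight-line fit is adequate only for a linear chirp; if the measured peak positions deviate systematically from the line, a higher-order polynomial can be fitted in the same way.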

Multispectral images can also be obtained simply by performing imaging with different delay line positions. Figure 7 shows the SRS spectrum of a live zebrafish larva corresponding to the anatomical region of the Tectum Opticum in the C-H fingerprint region. The intensities at 2845, 2931, and 2967 cm<sup>-1</sup> (marked with colored lines) were found to have good correlation with lipid, protein, and DNA signatures, respectively [24]. The corresponding SRS images are shown on the right side.

Fig. 7 SRS spectrum recorded from the brain of a live zebrafish with the corresponding SRS images (large field of view, upper line) and zoomed-in insets (lower line) at different vibrational levels. Scale bars are 50 (upper line) and 10 μm (lower line)

3.3 Methods for Optimal Signal Detection: Differential Detection

In Differential Detection (DD), a reference signal is measured simultaneously with the SRS one. This component is subtracted from the SRS signal in order to suppress the common noise of the laser light source. DD is an efficient solution to reduce the noise and increase the sensitivity of SRS measurements. The main requirements for high noise reduction are to minimize the delay between the signals coming from the reference and SRS arms and to have similar signal levels at the two detectors.

For SRS with DD, the reference signal can be generated in several ways. In some configurations, the detected beam is split before the main dichroic mirror (Fig. 2), and part of it is diverted to the reference detector, so the reference beam differs from the SRS one both spatially and temporally. As a consequence of the latter, this solution requires the insertion of a delay, either into the optical or the electronic path, that compensates for the longer distance between the beam splitter and the SRS detector under the stage of the microscope. The main problem with this approach is that it cannot compensate for the fluctuations of the signal level at the SRS detector caused by the varying optical density of the sample during scanning.

The other approach is to have the reference beam follow the same optical path as the SRS one and to place the reference detector under the sample as well. This allows the two components to undergo the same intensity change upon passing through the sample of varying optical density. The reference beam can be realized by creating a delayed replica of the pulses in the pump or Stokes beam (the one detected) that is not temporally synchronized with the pulses of the other beam [25, 26]. Then, the only difference between the pulse and its replica is that the intensity of the former is affected by the SRS interaction, so their differential detection results in efficient common mode noise suppression. The two arms of the balanced detection can be conveniently multiplexed and de-multiplexed by taking advantage of orthogonal linear polarization states.

Fig. 8 Schematic layout of the differential detection. HWP – half-wave plate, M – mirror, PBS – polarization-dependent beamsplitter, A-B detector – differential detector

Figure 8 shows the schematic of the DD with pulse replica generation for the Stokes beam. The replica generation unit is inserted along the beam that is used for the detection of the SRS signal, upstream of the dichroic mirror combining the pump and Stokes beams (DM in Fig. 2). The combination of a half-wave plate (HWP) and a polarization-dependent beamsplitter (PBS) is used to split the beam into two components of orthogonal polarization (in Fig. 8, the horizontally polarized beam is transmitted and the vertically polarized one is diverted). The HWP rotates the polarization of the laser, then the PBS separates the components with orthogonal polarizations. The intensity ratio of the two components is determined by the angle of rotation. The diverted beam is directed into another PBS, which collinearly recombines the two beams separated earlier. However, due to the additional distance travelled by the diverted beam, the two pulses are delayed with respect to each other, with the horizontally polarized pulse running ahead. So, after the replica generation unit, the beam consists of two orthogonally polarized pulse trains, with direct and replica pulses typically delayed by a few hundred ps. An additional HWP is also inserted into the replica arm between the two mirrors, allowing independent control of the intensity of the replica pulses.
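The dependence of the splitting ratio on the HWP angle, and the link between extra path length and replica delay, can be estimated with a short sketch. It assumes ideal components, a horizontally polarized input beam, a PBS that transmits the horizontal component, and a detour travelled in air treated as vacuum; the 100 mm detour is a hypothetical value.

```python
import math

C_MM_PER_PS = 0.299792458   # speed of light in vacuum, mm/ps

def pbs_split(hwp_angle_deg: float):
    """Fractions (transmitted, diverted) after an ideal HWP + PBS.

    A half-wave plate whose fast axis is at angle theta to the input linear
    polarization rotates the polarization by 2*theta.
    """
    rotation = math.radians(2.0 * hwp_angle_deg)
    transmitted = math.cos(rotation) ** 2
    return transmitted, 1.0 - transmitted

def replica_delay_ps(extra_path_mm: float) -> float:
    """Delay of the replica pulse caused by an additional path length."""
    return extra_path_mm / C_MM_PER_PS

for angle in (0.0, 10.0, 22.5):
    direct, replica = pbs_split(angle)
    print(f"HWP at {angle:4.1f} deg: direct {direct:.2f}, replica {replica:.2f}")
print(f"100 mm detour -> {replica_delay_ps(100.0):.0f} ps delay")
```

Under these assumptions an equal split is obtained at 22.5°, and a detour of the order of 10 cm gives the delay of a few hundred ps mentioned in the text.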

The resulting Stokes beam is combined with the pump on the dichroic mirror (DM) and introduced into the microscope. Both the original and the replica pulses follow the same path, and the former undergoes SRS interaction at the sample. Due to the delay, the latter has no paired pulse for the SRS to occur. A PBS is used after the sample to separate the orthogonally polarized SRS and non-SRS beams, which are directed to the SRS and reference arms. Both arms contain identical detectors and filters. The path length difference between the direct and replica beams can be compensated by the length of the cabling used in the detection electronics.

Efficient common-mode noise suppression requires the same laser intensities in the two arms. This can be adjusted by rotating the two HWPs. The use of two laser intensity regulators allows very precise tuning of the intensity ratio.
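Why the balance of the two arms matters can be illustrated with a toy simulation. The sketch below is an assumption-laden model, not a description of the actual detector: it contains only common-mode laser intensity noise (no shot noise or detector noise), a hypothetical relative SRS-induced change of 1e-4 in the SRS arm, and a `balance` parameter giving the ratio of the reference-arm to SRS-arm intensity.

```python
import random
import statistics


def differential_output(balance, srs_change=1e-4, rel_noise=0.01, n=20000, seed=0):
    """Return (mean, std) of the A-B signal for a given arm balance.

    balance    -- reference-arm intensity relative to the SRS arm (1.0 = balanced)
    srs_change -- relative SRS-induced intensity change (hypothetical value)
    rel_noise  -- relative rms laser intensity noise shared by both arms
    """
    rng = random.Random(seed)
    diffs = []
    for _ in range(n):
        laser = 1.0 + rng.gauss(0.0, rel_noise)  # common-mode fluctuation
        srs_arm = laser * (1.0 + srs_change)     # pulses that experienced SRS
        ref_arm = balance * laser                # delayed replica pulses
        diffs.append(srs_arm - ref_arm)
    return statistics.mean(diffs), statistics.pstdev(diffs)


if __name__ == "__main__":
    for balance in (1.0, 0.9, 0.0):  # balanced, 10% mismatch, single channel
        mean, std = differential_output(balance)
        print(f"balance={balance:.1f}  mean={mean:.2e}  noise={std:.2e}")
```

In this simplified model the residual noise scales with the intensity mismatch between the arms, so even a 10% imbalance leaves noise well above the assumed SRS signal level.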

Fig. 9 SRS image recorded on 2-micron-sized polystyrene microbeads with (upper part) and without (lower part) differential detection

The effect of DD on the signal-to-noise ratio of the SRS image can be seen in Fig. 9. The upper part of the image of polystyrene microbeads was recorded in differential detection mode, while the system was switched to single-channel detection for the lower part. The image quality of the two parts differs markedly: without differential detection, the signal-to-noise ratio is significantly reduced by the higher intensity noise.

#### 4 Notes

#### 4.1 Signal-to-Noise Ratio of SRS Images

The signal-to-noise (S/N) ratio is a useful parameter for characterizing SRS images. Recording a good-quality SRS image requires many parameters of the SRS system to be set optimally, including the temporal and spatial overlap of the pump and Stokes beams, the lock-in detection parameters (modulation depth, frequency, and filtering), the pump and Stokes beam intensities and their ratio, the integration time, etc. Measuring the S/N ratio on a dedicated sample on a regular basis, or at the beginning of every session, can provide valuable information on the current status and performance of the SRS system.

Figure 10 shows an SRS image recorded on polystyrene microbeads of 5- and 10-micron size, together with an intensity profile measured along a line crossing the top center microparticle. The image is of good quality and the microparticles are clearly visible. The profile represents the intensity of each pixel along the selected line and can be used to calculate the S/N ratio. The signal and noise regions are marked with red lines; the mean value and the standard deviation are calculated in each region, and the S/N ratio is obtained as the difference between the two mean values divided by the standard deviation of the background noise region. This particular example (Fig. 10) gives an S/N ratio of over 250.
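The S/N calculation described above can be written as a short function. This is a minimal sketch: the function name, the way the regions are passed (as Python slices into the line profile), and the use of the sample standard deviation are assumptions made for illustration, and the example profile is synthetic.

```python
import statistics


def snr_from_profile(profile, signal_region, background_region):
    """S/N ratio of a line profile.

    profile           -- sequence of pixel intensities along the selected line
    signal_region     -- slice covering the pixels on the microbead
    background_region -- slice covering background-only pixels
    Returns (mean_signal - mean_background) / std_background.
    """
    signal = profile[signal_region]
    background = profile[background_region]
    noise = statistics.stdev(background)  # sample standard deviation of the background
    if noise == 0:
        raise ValueError("background region has zero variance")
    return (statistics.mean(signal) - statistics.mean(background)) / noise


if __name__ == "__main__":
    # Synthetic profile: five background pixels followed by five bead pixels.
    profile = [1.0, 1.2, 0.8, 1.1, 0.9, 50.0, 51.0, 49.0, 50.5, 49.5]
    print(snr_from_profile(profile, slice(5, 10), slice(0, 5)))
```

Applying the same regions to the same reference sample at each session makes the resulting values directly comparable over time.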

Fig. 10 High quality sf-SRS image recorded on polystyrene microbeads and an intensity profile measured on the top center microparticle

#### 4.2 Strategies for Ensuring Optimal Spatio-Temporal Overlap Between Pump and Stokes Beams

SRS requires tight focusing of the pump and Stokes beams. This criterion is very strict: the pulses must be in the very same focal volume at the same time. Several factors can disturb this overlap, temporarily or in the long term, even in an optimally adjusted SRS system, including temperature and air density fluctuations, the pointing stability of the two beams, deformation of the mirror surfaces due to heating by the intense laser beams, etc. On tunable SRS systems, the wavelength dependence of the depth of focus of many objective lenses can be another issue.

The above problems can be addressed in several ways. The local temperature and air density fluctuations in the surroundings of the SRS system can be minimized by using air conditioning with precise temperature control, proper shielding (a closed box) around the optical setup, and proper cooling of the parts that tend to overheat.

The pointing stability of the two beams can be improved by integrating an automatic beam stabilizer into the optical paths of the SRS system. The beam stabilizer for a single beam consists of two steering mirrors, each with high-speed and high-resolution angular adjustment in two orthogonal directions (realized through stepper motors or piezo drives), and two laser beam position sensors (e.g., a camera or a quadrant detector) that can detect the lateral misalignment of the laser beam with high precision. The two detectors are placed at a large distance from each other in an auxiliary arm obtained by placing a beam picker or a beam splitter into the main beam.
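The reason for placing the two position sensors far apart can be seen from the geometry: the lateral offset and the pointing angle of the beam are recovered from the two readings, and the angular sensitivity improves with the sensor separation. The sketch below treats one transverse axis only and is not a description of any particular commercial stabilizer; the function name and units are hypothetical.

```python
import math


def beam_error(pos1_mm, pos2_mm, separation_mm):
    """Estimate beam misalignment along one transverse axis.

    pos1_mm, pos2_mm -- beam positions read on the first and second sensor
    separation_mm    -- distance between the sensors along the beam
    Returns (offset_mm at the first sensor, pointing angle in mrad).
    """
    if separation_mm <= 0:
        raise ValueError("sensor separation must be positive")
    angle_rad = math.atan2(pos2_mm - pos1_mm, separation_mm)
    return pos1_mm, angle_rad * 1e3


if __name__ == "__main__":
    # A 0.5 mm difference between readings over 500 mm is about 1 mrad of tilt.
    print(beam_error(0.1, 0.6, 500.0))
```

A feedback loop would then drive the two steering mirrors until both the offset and the angle return to zero on both axes; that control step is not sketched here.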

Fig. 11 Dependence of the divergence of a laser beam of 800 nm on the distance of the two lenses of a divergence corrector telescope

The different depth of focus of laser beams of different wavelengths is related to the dispersion of the transmissive optical elements, which do not focus light of different colors to the same depth. This can be corrected by adjusting the divergence of the beams, which changes the depth of focus. The simplest divergence corrector is a telescope whose components (lenses or curved mirrors) are slightly displaced axially from their ideal positions. This displacement alters the divergence angle of the output beam, which in turn shifts the depth of the focus of the beam under the objective lens. The change of the divergence of the output beam relative to the input for a 1:1 telescope is shown in Fig. 11. As the distance between the optical components changes by 0.5 cm, the divergence is altered by ca. 15–20%, which clearly indicates the efficiency of this unit for the adjustment of the focal depths.
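The effect of axially detuning a two-lens telescope can be illustrated with paraxial ray (ABCD) matrices. This is a geometric-optics sketch under the thin-lens approximation, not a Gaussian-beam calculation, and it does not attempt to reproduce Fig. 11; the 100 mm focal lengths and the 1 mm ray height are hypothetical values.

```python
def telescope_abcd(f1, f2, detuning):
    """ABCD matrix of a two-lens telescope (thin lenses, lengths in metres).

    The lenses are separated by f1 + f2 + detuning. The matrix is the product
    lens(f2) * propagation(d) * lens(f1), returned as the tuple (A, B, C, D).
    """
    d = f1 + f2 + detuning
    a = 1.0 - d / f1
    b = d
    c = -(1.0 / f1 + 1.0 / f2 - d / (f1 * f2))  # equals detuning / (f1 * f2)
    dd = 1.0 - d / f2
    return a, b, c, dd


def output_angle(ray_height, f1, f2, detuning):
    """Output angle (rad) of an input ray parallel to the axis at ray_height."""
    _, _, c, _ = telescope_abcd(f1, f2, detuning)
    return c * ray_height


if __name__ == "__main__":
    f = 0.100  # hypothetical focal length of both lenses (1:1 telescope)
    for detuning_mm in (0.0, 2.5, 5.0):
        angle = output_angle(1e-3, f, f, detuning_mm * 1e-3)
        print(f"detuning {detuning_mm} mm -> output angle {angle * 1e3:.3f} mrad")
```

In this approximation a perfectly aligned telescope returns a collimated ray (zero output angle), and the magnitude of the added angle grows linearly with the detuning, as detuning / (f1 f2) per unit ray height.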

#### 4.3 Optimizing the Beam Modulation Frequency and Depth

As described earlier, detection of the SRS signal relies on lock-in amplification, which requires one of the excitation beams to be modulated at high frequency. This is typically achieved using an acousto-optical modulator (AOM), a crystal with a piezo actuator attached to it. It works like a dynamic optical grating, in which the acoustic waves launched into the crystal by the piezo actuator create a periodic modulation of the refractive index. As a result, the crystal behaves as a diffractive element and diverts the incident beam to the diffraction angle. By switching the acoustic waves on and off, the beam is switched between the straight and diverted states, so the intensities of both the straight and the diverted beams are modulated.

Fig. 12 Frequency dependence of poor (blue) and optimized (red) modulation depths

As for optical gratings, the diverting efficiency of the AOM depends on the alignment of the AOM and on the diameter, angle of incidence, etc., of the incident beam. At kHz modulation rates, units with a relatively large aperture can achieve a high modulation depth even with poor alignment. However, as the modulation frequency increases, the conditions for efficient modulation become stricter and further optimization of the alignment is required. Figure 12 compares the frequency dependence of the modulation depth of a poorly and an optimally aligned AOM module. While the modulation depth decreases rapidly for the poorly aligned setup (<0.8 above 200 kHz, <0.6 above 1 MHz), efficient modulation can be achieved even at frequencies of a few MHz with optimized alignment (>0.9 up to 4 MHz).
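When optimizing the AOM alignment, it can help to quantify the modulation depth from a recorded photodiode trace and to compare the modulation period with the time the acoustic wave needs to cross the beam. The sketch below rests on assumptions: the chapter does not state which definition of modulation depth was used for Fig. 12, so one common convention, (I_max - I_min) / I_max, is used here, and the beam diameter and acoustic velocity in the example are hypothetical.

```python
def modulation_depth(trace):
    """Modulation depth of an intensity trace as (I_max - I_min) / I_max.

    This convention is an assumption; other definitions exist.
    """
    i_max = max(trace)
    i_min = min(trace)
    if i_max <= 0:
        raise ValueError("trace must contain positive intensities")
    return (i_max - i_min) / i_max


def acoustic_transit_time_ns(beam_diameter_mm, acoustic_velocity_m_s):
    """Time (ns) for the acoustic wavefront to cross the optical beam."""
    return beam_diameter_mm * 1e-3 / acoustic_velocity_m_s * 1e9


if __name__ == "__main__":
    print(modulation_depth([1.0, 0.1, 1.0, 0.1]))  # 0.9 for this synthetic trace
    # Hypothetical values: 1 mm beam, 4200 m/s acoustic velocity.
    transit = acoustic_transit_time_ns(1.0, 4200.0)
    period = 1e9 / 4e6  # period of a 4 MHz modulation in ns
    print(f"transit {transit:.0f} ns vs period {period:.0f} ns")
```

With these hypothetical numbers the transit time is comparable to the period of a 4 MHz modulation, which illustrates why the beam diameter inside the crystal becomes critical at MHz rates.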

#### Acknowledgments

The authors would like to acknowledge the support of the Department of Biomedical Sciences (SID2018, Dal Maschio) and the Padua Neuroscience Center (ReTurnPD, Dal Maschio) at the University of Padua, as well as the support of EC Research Programs (VISGEN, Dal Maschio; NEURAM, Veres and Dal Maschio). The authors would also like to thank the colleagues who provided help, comments, and suggestions in drafting the content.



Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

## INDEX

#### A


#### B


#### C


#### E

Event-related analysis of functional fluorescence traces ................................................ 149, 150, 159

#### F


#### G


#### H

Holographic microscope ..............................................105

#### I


#### L


#### M


#### N


#### O


#### P


Eirini Papagiakoumou (ed.), All-Optical Methods to Study Neuronal Function, Neuromethods, vol. 191, https://doi.org/10.1007/978-1-0716-2764-8, © The Editor(s) (if applicable) and The Author(s) 2023


#### S


#### T


Two-photon imaging......................................14, 63, 102, 118, 122, 146, 172, 179, 197, 198, 208, 337, 343, 347

Two-photon optogenetics..................................123, 129, 138, 331, 337, 348

#### U


#### V


#### W


#### Z

Zebrafish...............................................51, 138, 232, 238, 247–250, 254, 307, 408, 409